Abstract: This paper is the second in a series of two on the problem of estimating a function of a
probability distribution from a finite set of samples of that distribution. In the first paper [1], the
Bayes estimator for a function of a probability distribution was introduced, the optimal properties
of the Bayes estimator were discussed, and the Bayes and frequency-counts estimators for the
Shannon entropy were derived and graphically contrasted. In the current paper the analysis of the first
paper is extended by the derivation of Bayes estimators for several other functions of interest in
statistics and information theory. These functions are (powers of) the mutual information,
chi-squared for tests of independence, variance, covariance, and average. Finding Bayes estimators for
several of these functions requires extensions to the analytical techniques developed in the first
paper, and these extensions form the main body of this paper. This paper extends the analysis in other
ways as well, for example by enlarging the class of potential priors beyond the uniform prior
assumed in the first paper. In particular, the use of the entropic and Dirichlet priors is considered.



PACS numbers: 02.50.+S, 05.20.-y 



1. BACKGROUND 



Consider a system with $m$ possible states and an associated $m$-vector of probabilities of those
states, $p = (p_i)$, $1 \le i \le m$, ($\sum_{i=1}^{m} p_i = 1$). The system is repeatedly and independently sampled
according to the distribution $p$. Let the total number of samples be $N$ and denote the associated
vector of counts of states by $n = (n_i)$, $1 \le i \le m$, ($\sum_{i=1}^{m} n_i = N$). By definition, $n$ is
multinomially distributed. In some cases in this paper the states will be indexed by two integers. For these
cases $p$ and $n$ are matrices.

In many cases what we are interested in is not $p$, but rather some function of $p$, $F(p)$. The
problem at hand is to estimate such a function $F(p)$ from the data $n$. More precisely, the problem
is to investigate the posterior density of $F(p)$, i.e. the probability density function (pdf)

    $P(F(p) = f \mid n) = \int dp\, \delta(F(p) - f)\, P(p \mid n)$.    (1)

Usually it is difficult to analytically compute $P(F(p) = f \mid n)$. Accordingly, here we instead
compute posterior moments of $F(p)$, i.e., the moments of $F(p)$ according to the probability density
in Eq. (1).

These posterior moments do more than simply give us a characterization of $P(F(p) = f \mid n)$.
For example, in [1] we show that the estimator $G(n)$ that minimizes the mean-squared error from
the true $F(p)$ is given by the first such posterior moment, the posterior average:

    $G(n) = E[F(p) \mid n] = \int dp\, F(p)\, P(p \mid n) = \int df\, f\, P(f \mid n)$.    (2)

In other words, this $G(n)$ minimizes

    $\int dp\, P(p) \sum_n P(n \mid p) \times (G(n) - F(p))^2$,    (3)

where $P(p)$ is the so-called "prior" pdf of $p$. In this sense, this $G(n)$ is the optimal estimator for
$F(p)$. The estimator $E[F(p) \mid n]$ is known as the Bayes estimator for $F(p)$ with prior $P(p)$.
The second posterior moment is also useful. As was mentioned in [1], by using Chebyshev's
inequality, the second posterior moment can be used to bound the probability that $F(p)$ deviates
substantially from the Bayes estimator for $F(p)$.

We now introduce the functions $F(p)$ for which we derive Bayes estimators in this paper. The
functions $F(p)$ considered here are (the first two) powers of the mutual information, chi-squared
for tests of independence, covariance, variance, and average. (In some cases the methods of this
paper will allow consideration of all powers of these functions.) This choice of functions is not
meant to be exhaustive. Rather it is meant both to exemplify some of the mathematical techniques
involved in calculating Bayes estimators of functions $F(p)$, and to provide some useful results. The
techniques of this paper should be applicable to many other functions of interest as well.

The mutual information is defined in terms of a matrix $p$ by

    $M(p) = S((p_{i\cdot})) + S((p_{\cdot j})) - S(p)$.    (4)

Here $(p_{i\cdot})$ and $(p_{\cdot j})$ are the vectors of column and row sums of $p = (p_{ij})$ respectively, i.e.
$p_{i\cdot} = \sum_j p_{ij}$ and similarly for $(p_{\cdot j})$. $S(p)$ is the usual Shannon entropy: $S(p) = -\sum_{ij} p_{ij} \log(p_{ij})$,
while $S((p_{i\cdot})) = -\sum_i p_{i\cdot} \log(p_{i\cdot})$, and similarly for $S((p_{\cdot j}))$. Mutual information is a measure of
the amount of information shared between two symbol streams (symbolic dynamical systems) with
joint probability $p_{ij}$ [2]. It may also be seen as a measure of the correlation between two symbolic
systems with joint probability $p_{ij}$ [3]. The mutual information function has applications in areas
such as communication theory [2] (e.g. the measurement of channel capacity), pattern recognition
[4], and natural languages [5], to name but a few.

The chi-squared statistic for independence is also given by a function of a matrix $p$,

    $\chi^2(p) = \sum_{ij} \left( (p_{ij} - p_{i\cdot} p_{\cdot j})^2 / (p_{i\cdot} p_{\cdot j}) \right)$.    (5)

Chi-squared is commonly used in statistical tests of independence [6], where it appears in a form
with the maximum-likelihood estimator of $p$ substituted for $p$ in Eq. (5). The form in Eq. (5) is
proportional to the asymptotic (large data set) statistic used in these tests, and it is easily shown to be
a first order approximation to the mutual information under certain conditions [7-8].
The covariance function of a matrix $p$ is given by

    $\mathrm{Cov}(p) = \sum_{ij} p_{ij} (X_i - \mu_x)(Y_j - \mu_y)$,    (6)

where each of the $m$ possible states is associated with some ordered pair $(X_i, Y_j)$ of numbers
(there are $m$ index pairs $(i, j)$ altogether). The $ij$'th state occurs with probability $p_{ij}$. The means
are $\mu_x$ and $\mu_y$; $\mu_x = \sum_i p_{i\cdot} X_i$ and similarly for $\mu_y$ [6].
The variance function of a vector $p$ is given by

    $\mathrm{Var}(p) = \sum_i p_i (X_i - \mu_x)^2$,    (7)

where the $i$'th state is associated with the number $X_i$ and occurs with probability $p_i$.
Finally, the average is a function of a vector $p$ given by

    $\mathrm{Avg}(p) = \sum_i p_i X_i$.    (8)
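
As an illustration of Eqs. (4)-(8), the following minimal Python sketch evaluates each function for a known distribution. The function names are our own illustrative choices, natural logarithms are assumed, and matrices are represented as lists of rows; none of this is prescribed by the text.

```python
import math

def shannon_entropy(probs):
    """S = -sum q log q over a flat list of probabilities (0 log 0 taken as 0)."""
    return -sum(q * math.log(q) for q in probs if q > 0)

def marginals(p):
    """Return (p_{i.}, p_{.j}) for a matrix p given as a list of rows."""
    return [sum(r) for r in p], [sum(c) for c in zip(*p)]

def mutual_information(p):
    """Eq. (4): entropies of both marginals minus the entropy of the joint."""
    rows, cols = marginals(p)
    joint = [q for r in p for q in r]
    return shannon_entropy(rows) + shannon_entropy(cols) - shannon_entropy(joint)

def chi_squared(p):
    """Eq. (5); assumes every marginal probability is strictly positive."""
    rows, cols = marginals(p)
    return sum((p[i][j] - rows[i] * cols[j]) ** 2 / (rows[i] * cols[j])
               for i in range(len(rows)) for j in range(len(cols)))

def average(p, x):
    """Eq. (8) for a probability vector p and state values x."""
    return sum(pi * xi for pi, xi in zip(p, x))

def variance(p, x):
    """Eq. (7)."""
    mu = average(p, x)
    return sum(pi * (xi - mu) ** 2 for pi, xi in zip(p, x))

def covariance(p, x, y):
    """Eq. (6), with the means taken under the marginals of the matrix p."""
    rows, cols = marginals(p)
    mu_x, mu_y = average(rows, x), average(cols, y)
    return sum(p[i][j] * (x[i] - mu_x) * (y[j] - mu_y)
               for i in range(len(x)) for j in range(len(y)))
```

For a product distribution such as `[[0.06, 0.14], [0.24, 0.56]]`, the mutual information, chi-squared and covariance all vanish up to rounding, which is a convenient sanity check.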

In Sec. 2 we derive Bayes estimators with uniform prior for these functions of probability
distributions. The notation used in these results is summarized in subsections 2a and 2b. The results
themselves, the Bayes estimators for the various functions being considered, are summarized in
subsections 2d and 2g. In Sec. 3 we show how these Bayes estimator results along with those of
[1], all derived under the assumption of a uniform prior, are modified when various different priors
are assumed. In particular, we discuss the entropic prior $P(p) \propto e^{\alpha S(p)}$ and a broad class of priors
which includes the Dirichlet prior, $P(p) \propto \prod_{i=1}^{m} p_i^{r}$.

Throughout this paper it is assumed that the reader is at least passingly familiar with the
analysis in [1], the first paper in this two-paper series.

2. CALCULATIONS FOR THE BAYES ESTIMATORS 

In this section we present the calculations needed to derive the first two moments of the
functions discussed in Sec. 1. The subsections are organized as follows. In Sec. 2a we discuss the form
of the integrations to be done. Sec. 2b contains a presentation of notation motivated by considering
the special case where $F(p)$ is the mutual information raised to some power. Section 2c contains
intermediate results, Thms. 9-15. In Sec. 2d we use the results of Sec. 2c to derive the Bayes
estimators for the first two powers of most of the functions described in Sec. 1. The results appear as
Thms. 16-21. Sections 2e-g parallel Secs. 2b-d (notation, intermediate results (Thms. 22-25), and
Bayes estimators (Thms. 26 and 27)), but for integrals more complicated than those considered in
Secs. 2b-d.

The reader interested only in the results for the Bayes estimators for the $F(p)$ described in
Sec. 1 should see Secs. 2d and 2g.

2a. THE FORM OF THE INTEGRALS. 

Recall from Sec. 1 that the Bayes estimator for $F(p)$ is the posterior average
$E[F(p) \mid n] = \int dp\, P(p \mid n) F(p)$. Let $\Delta(p) = \delta(\sum_i p_i - 1)$ and $\Theta(p) = \prod_i \theta(p_i)$, where $\theta(x) = 1$
for $x \ge 0$, 0 otherwise. Define $I[F(p), n]$ by

    $I[F(p), n] = \int dp\, F(p)\, \Delta(p)\, \Theta(p) \prod_{i=1}^{m} p_i^{n_i}$.    (9)

Using this notation, when the prior $P(p)$ is uniform, i.e. when $P(p) \propto \Delta(p)\Theta(p)$, it is easily shown
(see [1]) that

    $E[F(p) \mid n] = I[F(p), n] / I[1, n]$.    (10)

The result for $I[1, n]$ appears in Thm. 3 of [1]. (In general, references to Thms. 1-8 are to the
theorems so numbered in [1]; this paper, being a continuation of [1], starts numbering its theorems at
9.) Therefore, when the prior is uniform, finding the Bayes estimator $E[F(p) \mid n]$ reduces to
evaluating the integral $I[F(p), n]$. In the rest of this section, this integral is calculated for each of the
$F(p)$ mentioned in Sec. 1.
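
The ratio in Eq. (10) can be sketched in a few lines of Python for the simplest case, $F(p) = p_i$. The sketch assumes the standard Dirichlet normalization $I[1, n] = \prod_i \Gamma(n_i + 1) / \Gamma(N + m)$ for the denominator, and uses the observation that, by Eq. (9), $I[p_i, n]$ is just $I[1, n']$ with $n'_i = n_i + 1$. The function names are hypothetical.

```python
import math

def log_I1(n):
    """log I[1, n], assuming I[1, n] = prod_i Gamma(n_i + 1) / Gamma(N + m)."""
    m, N = len(n), sum(n)
    return sum(math.lgamma(k + 1) for k in n) - math.lgamma(N + m)

def posterior_mean_p(n, i):
    """E[p_i | n] = I[p_i, n] / I[1, n], with I[p_i, n] = I[1, n + e_i]."""
    bumped = list(n)
    bumped[i] += 1
    return math.exp(log_I1(bumped) - log_I1(n))

def bayes_average(n, x):
    """Bayes estimator of Avg(p) = sum_i p_i X_i under the uniform prior."""
    return sum(posterior_mean_p(n, i) * xi for i, xi in enumerate(x))
```

Working with `math.lgamma` rather than `math.gamma` keeps the ratio stable when the counts are large.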

2b. NOTATION.
To understand the types of calculations that must be performed, consider the case where
$F(p) = M^k(p)$, $M(p)$ being the mutual information. In this case the integral of interest is

    $I[M^k(p), n] = \int_{\Delta p} dp \prod_{ij} p_{ij}^{n_{ij}} \times \left[ -\sum_i p_{i\cdot} \log(p_{i\cdot}) - \sum_j p_{\cdot j} \log(p_{\cdot j}) + \sum_{ij} p_{ij} \log(p_{ij}) \right]^k$.    (11)

(Here and in the rest of this paper we do not explicitly write the $\Delta(p)\Theta(p)$ factor in the integrand.
Rather, it is indicated by subscripting the integrals with $\Delta p$.) The right-hand side of Eq. (11)
expands to a sum of integrals of the form $I[\rho_1^{q_1} \log^{r_1}(\rho_1) \cdots \rho_k^{q_k} \log^{r_k}(\rho_k), n]$, with $q_i = r_i$, and with
each $\rho_i$ a sum of a subset of the $p_j$'s.

Since such sums of subsets of $p_i$'s will often arise, we introduce some special notation. Indicate
a subset of indices of the $p_i$'s by $\sigma$, and the sum of the $p_i$'s with indices $i \in \sigma$ by $\rho = \sum_{i \in \sigma} p_i$. If
there are $k$ such subsets, these will be represented by $\sigma_u$, $u = 1, \ldots, k$, and the corresponding
subset sums will be given by $\rho_u = \sum_{i \in \sigma_u} p_i$, $u = 1, \ldots, k$. In the case where $\sigma_u \cap \sigma_v = \emptyset$ for $u \ne v$,
the subsets will be called non-overlapping. If on the other hand $\sigma_u \cap \sigma_v \ne \emptyset$ and yet $\sigma_u \ne \sigma_v$, the
subsets will be called overlapping. Any expression involving non-overlapping (overlapping)
subset sums will be called a non-overlapping (overlapping) term. Pair-wise overlapping will be used
to indicate a term involving two subsets which are overlapping (as opposed to multiple such
subsets).

Since we will often be dealing with overlapping subsets, the notation $\sigma_{uv}$ will be used for
$\sigma_u \cap \sigma_v$. (No confusion arises since we never explicitly refer to $\sigma_u$ with $u$ a double-digit number.)
Similarly, we use the notation $\sigma_{u-uv} = \sigma_u - \sigma_{uv}$ to refer to those indices in $\sigma_u$ but not in $\sigma_v$,
$\sigma_{u+v} = \sigma_u \cup \sigma_v$ to refer to those indices in either $\sigma_u$ and/or in $\sigma_v$, and obvious extensions when
more than two sets are being considered.
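
The subset notation maps directly onto set operations. The following Python sketch, with hypothetical helper names and matrix indices represented as `(i, j)` tuples, classifies a pair of index subsets in the terminology above and forms the derived subsets $\sigma_{uv}$, $\sigma_{u-uv}$ and $\sigma_{u+v}$. The "contained" versus "non-contained" distinction anticipates the usage in Sec. 2c.

```python
def classify(sigma_u, sigma_v):
    """Classify two index subsets as non-overlapping, identical, or overlapping."""
    if not (sigma_u & sigma_v):
        return "non-overlapping"
    if sigma_u == sigma_v:
        return "identical"
    if sigma_u < sigma_v or sigma_v < sigma_u:
        return "overlapping (contained)"
    return "overlapping (non-contained)"

def derived_subsets(sigma_u, sigma_v):
    """Return (sigma_uv, sigma_{u-uv}, sigma_{u+v})."""
    common = sigma_u & sigma_v
    return common, sigma_u - common, sigma_u | sigma_v

def subset_sum(values, sigma):
    """rho = sum of values[i] over i in sigma; values is a dict keyed by index."""
    return sum(values[i] for i in sigma)

def row(i, n_cols):
    """Index subset picking out row i of a matrix with n_cols columns."""
    return {(i, j) for j in range(n_cols)}

def col(j, n_rows):
    """Index subset picking out column j of a matrix with n_rows rows."""
    return {(i, j) for i in range(n_rows)}
```

For a 2 x 3 matrix, two different rows are non-overlapping, a row and a column overlap in exactly one cell without either containing the other, and a single cell is contained in its row.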



LANL LA-UR 93-833    SFI TR-93-07-047



Note that in Eq. (11), for $k = 1$ the integral reduces to a sum of integrals of non-overlapping
terms. For $k = 2$ some of the integrands are pair-wise overlapping terms. However for $k = 2$ no
term occurs that involves more than two overlapping subsets.

We will use the following conventions for various other quantities which arise. For generic
numbers appearing as exponents in convolution expressions the variable $\alpha$ will be used. Just as we
previously defined $\rho$'s to refer to sums of those $p_i$ picked out by index subsets $\sigma$, we also need
notation for the sums of $n_i$'s picked out by such $\sigma$'s. The variable $\beta$ will be used to indicate these
sums, with the convention discussed above for subscripts on $\rho$ holding for $\beta$'s subscripts;
$\beta_u = \sum_{i \in \sigma_u} (n_i + 1)$. It will also sometimes be useful to use the notational convention $\nu_i = n_i + 1$,
$i = 1, \ldots, m$. In the case where $n$ and $p$ are matrices instead of vectors we use $\nu_{ij} = n_{ij} + 1$.
Similarly, to denote row or column sums of $n_{ij}$'s we use $\nu_{i\cdot} = \sum_j \nu_{ij}$ and $\nu_{\cdot j} = \sum_i \nu_{ij}$.

We will also have need of the following notation: $\beta = \nu = \sum_{i=1}^{m} \nu_i$; $\gamma_n = \prod_{i=1}^{m} \Gamma(\nu_i)$;
$\eta = \sum_{u=1}^{k} \eta_u$ ($\eta_u$ is associated with subset $\sigma_u$, $u = 1, \ldots, k$); $\Phi^{(n)}(z) = \Psi^{(n-1)}(z)$, where
$\Psi^{(n)}(z)$ is given by $\Psi^{(n)}(z) = \partial_z^{n+1} \log(\Gamma(z))$ (see [9], Eq. 6.3.1); and the "delta-phi" function
is given by $\Delta\Phi^{(n)}(z_1, z_2) = \Phi^{(n)}(z_1) - \Phi^{(n)}(z_2)$.
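
Since $\Phi^{(1)}$ is the digamma function and $\Phi^{(2)}$ the trigamma function, the delta-phi function has an elementary form at positive integer arguments, which is the case that arises for integer counts. The Python sketch below uses the recurrences $\psi(z+1) = \psi(z) + 1/z$ and $\psi'(z+1) = \psi'(z) - 1/z^2$ and exact rational arithmetic. The function name is our own, and only orders 1 and 2 at positive integers are handled.

```python
from fractions import Fraction

def delta_phi(order, z1, z2):
    """Delta Phi^(order)(z1, z2) = Phi^(order)(z1) - Phi^(order)(z2).

    Restricted to order in {1, 2} and positive integers z1, z2.
    """
    if order not in (1, 2):
        raise ValueError("only orders 1 and 2 are implemented")
    if z1 < 1 or z2 < 1:
        raise ValueError("arguments must be positive integers")
    lo, hi = sorted((z1, z2))
    if order == 1:
        # digamma(hi) - digamma(lo) = sum_{k=lo}^{hi-1} 1/k
        diff = sum(Fraction(1, k) for k in range(lo, hi))
    else:
        # trigamma(hi) - trigamma(lo) = -sum_{k=lo}^{hi-1} 1/k^2
        diff = -sum(Fraction(1, k * k) for k in range(lo, hi))
    return diff if z1 >= z2 else -diff
```

For non-integer arguments a numerical polygamma routine would be needed instead; none is assumed here.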

Notation for the hypergeometric functions used here appears in App. A. In particular, the
special use of subscripts on parentheses (e.g., $(a)_b$) is defined there, as are functions of the form ${}_pF_q$
and functions of the form ${}_{p_1, p_2, p_{12}}F_{q_1, q_2, q_{12}}$.

As it stands, the integrand in Eq. (11) is not "factorable" as defined in Sec. 4b of [1]. Having
an integrand in factorable form is desirable because it allows us to apply the procedure of Sec. 4b
of [1] to evaluate the integral. In what follows we utilize the T transform (see App. B.1) to convert
the integrand of Eq. (11) (and similar integrands to appear) into factorable form. Once the
integrand is factorable, we can perform the integration. After the integration we inverse T-transform
to arrive at the final result for the integral of interest.

2c. NON-OVERLAP AND PAIR-WISE OVERLAP CONVOLUTION INTEGRALS.

In this subsection we derive a number of intermediate results. We begin this subsection with
the fundamental convolution integrals needed to derive non-overlap and pair-wise overlap
convolutions (Thms. 9-11). We then present specific integrals $I[\cdot, \cdot]$ of these two types (Thms. 12-15).
Theorems 12 and 13a,b apply to non-overlap terms; Theorems 14a,b and 15a,b apply to pair-wise
overlap terms. In Sec. 2d we use these intermediate results to evaluate Bayes estimators for several
of the functions $F(p)$ being considered.

Define the Laplace convolution operator "$\otimes$" acting on functions $f$ and $g$ by

    $(f \otimes g)(\tau) = \int_0^{\tau} dx\, f(x)\, g(\tau - x)$.    (12)

In the following, the convolution operator is (usually) implicitly assumed to refer to functions of
the variable $p$. As described in [1], this convolution operator is fundamental to evaluating $I[\cdot, \cdot]$
integrals.

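Theorem 9. If $\mathrm{Re}(\alpha_i) > 0$, $i = 1, 2$, then

    (9.1)  $(p^{\alpha_1 - 1} \otimes p^{\alpha_2 - 1})(\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha_1 + \alpha_2)} \times \tau^{\alpha_1 + \alpha_2 - 1}$,

    (9.2)  $(p^{\alpha_1 - 1} e^{-pt} \otimes p^{\alpha_2 - 1} e^{-pt})(\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha_1 + \alpha_2)} \times \tau^{\alpha_1 + \alpha_2 - 1} \times e^{-\tau t}$.

Proof: To prove (9.1) note that by Thm. 2 (the Laplace convolution theorem) and the fact
that $L[p^{\alpha - 1}](s) = \Gamma(\alpha) / s^{\alpha}$ for $\mathrm{Re}(\alpha) > 0$ we have

    $(p^{\alpha_1 - 1} \otimes p^{\alpha_2 - 1})(\tau) = L^{-1}\left[ L[p^{\alpha_1 - 1}]\, L[p^{\alpha_2 - 1}] \right](\tau) = L^{-1}\left[ \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{s^{\alpha_1} s^{\alpha_2}} \right](\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha_1 + \alpha_2)} \times \tau^{\alpha_1 + \alpha_2 - 1}$.

To prove (9.2) note that $L[p^{\alpha - 1} e^{-pt}](s) = \Gamma(\alpha) / (s + t)^{\alpha}$. The remainder of the proof parallels the
proof of (9.1) above: Substitute $L[p^{\alpha - 1} e^{-pt}]$ for $L[p^{\alpha - 1}]$. QED.

If desired, Thm. 3 may be thought of as a corollary to Thm. 9.1, by induction.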
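
Both parts of Thm. 9 are easy to check numerically. The Python sketch below approximates the Laplace convolution of Eq. (12) with a midpoint rule and compares it against the Beta-function closed forms, $\Gamma(\alpha_1)\Gamma(\alpha_2)/\Gamma(\alpha_1+\alpha_2) \times \tau^{\alpha_1+\alpha_2-1}$ and the same expression times $e^{-\tau t}$. The helper names and the choice of quadrature are illustrative assumptions, and the check is meant for exponents $\alpha_i \ge 1$, where the integrand has no endpoint singularity.

```python
import math

def laplace_convolution(f, g, tau, steps=20000):
    """Midpoint-rule approximation of (f (x) g)(tau) = int_0^tau f(x) g(tau - x) dx."""
    h = tau / steps
    return h * sum(f((k + 0.5) * h) * g(tau - (k + 0.5) * h) for k in range(steps))

def thm_9_1_rhs(a1, a2, tau):
    """Closed form of (p^(a1-1) (x) p^(a2-1))(tau)."""
    return math.gamma(a1) * math.gamma(a2) / math.gamma(a1 + a2) * tau ** (a1 + a2 - 1)

def thm_9_2_rhs(a1, a2, t, tau):
    """Closed form of (p^(a1-1) e^(-pt) (x) p^(a2-1) e^(-pt))(tau)."""
    return thm_9_1_rhs(a1, a2, tau) * math.exp(-tau * t)

def power(alpha, t=0.0):
    """Return the function p -> p^(alpha-1) * exp(-p t)."""
    return lambda p: p ** (alpha - 1) * math.exp(-p * t)
```

With $\alpha_1 = 2$, $\alpha_2 = 3$ and $\tau = 1.5$ the closed form reduces to $\tau^4 / 12$.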

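In order to discuss the next results succinctly, we need to use the confluent hypergeometric
function, discussed in App. A.

Theorem 10. If $\mathrm{Re}(\alpha_1) > 0$, $\mathrm{Re}(\alpha_2) > 0$, and $\alpha = \alpha_1 + \alpha_2$ then

    (10.1)  $(p^{\alpha_1 - 1} e^{-p t_1} \otimes p^{\alpha_2 - 1} e^{-p(t_1 + t_2)})(\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha)} \times \tau^{\alpha - 1} \times e^{-\tau(t_1 + t_2)} \times {}_1F_1(\alpha_1; \alpha; t_2 \tau)$,

    (10.2)  $(p^{\alpha_1 - 1} \otimes p^{\alpha_2 - 1} e^{-pt})(\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha)} \times \tau^{\alpha - 1} \times e^{-\tau t} \times {}_1F_1(\alpha_1; \alpha; t\tau)$.

Proof: To prove (10.1) write the convolution $(p^{\alpha_1 - 1} e^{-p t_1} \otimes p^{\alpha_2 - 1} e^{-p(t_1 + t_2)})(\tau)$ in its
integral form as $e^{-\tau(t_1 + t_2)} \int_0^{\tau} e^{p t_2} p^{\alpha_1 - 1} (\tau - p)^{\alpha_2 - 1} dp$. Make the change of variables $\tau u = p$,
which introduces the factor $\tau^{\alpha - 1}$. Substitute $\alpha = \alpha_1$, $\beta = \alpha$, $z = t_2 \tau$ and compare with the
form in App. A to find the result. The factor $\frac{\Gamma(\beta)}{\Gamma(\alpha)\Gamma(\beta - \alpha)}$ in the hypergeometric function in
App. A cancels the factor $\frac{\Gamma(\alpha_1)\Gamma(\alpha_2)}{\Gamma(\alpha)}$. To prove (10.2) substitute zero for $t_1$ in result (10.1).
QED.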



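Theorem 11 gives the primary equality needed to derive pair-wise overlap results. See App. A
for the definition of the generalized hypergeometric function ${}_{1,1,0}F_{0,0,1}$.

Theorem 11. If $\mathrm{Re}(\alpha_1) > 0$, $\mathrm{Re}(\alpha_{12}) > 0$, $\mathrm{Re}(\alpha_2) > 0$ and $\alpha = \alpha_1 + \alpha_{12} + \alpha_2$, then

    $(p^{\alpha_1 - 1} e^{-p t_1} \otimes p^{\alpha_{12} - 1} e^{-p(t_1 + t_2)} \otimes p^{\alpha_2 - 1} e^{-p t_2})(\tau) = \frac{\Gamma(\alpha_1)\Gamma(\alpha_{12})\Gamma(\alpha_2)}{\Gamma(\alpha)} \times \tau^{\alpha - 1} \times e^{-\tau(t_1 + t_2)} \times {}_{1,1,0}F_{0,0,1}[\alpha_1, \alpha_2; \alpha; t_2 \tau, t_1 \tau]$.

Proof: The convolution $p^{\alpha_1 - 1} e^{-p t_1} \otimes p^{\alpha_{12} - 1} e^{-p(t_1 + t_2)}$ was done in Thm. 10.1. Expand ${}_1F_1$
in that result in its series representation (see App. A)

    ${}_1F_1(\alpha_1; \alpha_1 + \alpha_{12}; t_2 \tau) = \sum_{k=0}^{\infty} \frac{(\alpha_1)_k}{(\alpha_1 + \alpha_{12})_k} \frac{(t_2 \tau)^k}{\Gamma(k + 1)}$

and convolve it with $p^{\alpha_2 - 1} e^{-p t_2}$ term-by-term (apply Thm. 10.2) to find the desired result. (This
is valid since the series is uniformly convergent on $[0, \tau]$.) QED.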



Thms. 9-11 are now used to derive integrals $I[\cdot, \cdot]$ for some non-overlap and pair-wise overlap
terms. Theorems 12 and 13 discuss the non-overlap case, while Thms. 14 and 15 discuss the
pair-wise overlap case.

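Theorem 12. If the subsets $\sigma_u$, defined for $u = 1, \ldots, k$, satisfy $\sigma_{uv} = \emptyset$ for all $u \ne v$,
and if $\mathrm{Re}(\beta_u + \eta_u) > 0$ for all $u = 1, \ldots, k$, and if $\mathrm{Re}(\nu_i) > 0$ for all
$i = 1, \ldots, m$ then

    $I[\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}, n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \prod_{u=1}^{k} \frac{\Gamma(\beta_u + \eta_u)}{\Gamma(\beta_u)}$.

Proof: Assume $k = 1$ and $\mathrm{Re}(\eta_1) < 0$ to begin. Apply $T^{-1}T$ with respect to $\eta_1$ to the
integral $I[\rho_1^{\eta_1}, n]$ (see App. B.1 for the definition of the T transform) and evaluate the inner T
transform. Noting $e^{-\rho_1 t} = \prod_{i \in \sigma_1} e^{-p_i t}$ and that $T[z^{\eta_1}](t) = e^{-zt}$ for $\eta_1 < 0$, find

    $I[\rho_1^{\eta_1}, n] = T^{-1}\left[ \int_{\Delta p} dp \left( \prod_{i \in \sigma_1} p_i^{n_i} e^{-p_i t} \right) \times \left( \prod_{i \notin \sigma_1} p_i^{n_i} \right) \right]$.

(See App. C.1 for the justification of the interchange of the integral over $p$ and the T transform.) Now,
write the transformed integral above as the convolution

    $I[\rho_1^{\eta_1}, n] = T^{-1}\left[ \left( (\otimes_{i \in \sigma_1} p^{n_i} e^{-pt}) \otimes (\otimes_{i \notin \sigma_1} p^{n_i}) \right)(1) \right]$.

Use Thm. 9.1 and induction to find (with $\bar{\beta} = \beta - \beta_1$)

    $(\otimes_{i \notin \sigma_1} p^{n_i})(\tau) = \frac{\prod_{i \notin \sigma_1} \Gamma(\nu_i)}{\Gamma(\bar{\beta})} \times \tau^{\bar{\beta} - 1}$.

Similarly, use Thm. 9.2 and induction to find

    $(\otimes_{i \in \sigma_1} p^{n_i} e^{-pt})(\tau) = \frac{\prod_{i \in \sigma_1} \Gamma(\nu_i)}{\Gamma(\beta_1)} \times \tau^{\beta_1 - 1} \times e^{-\tau t}$.

Substituting the last two expressions into $I[\rho_1^{\eta_1}, n]$ yields

    $I[\rho_1^{\eta_1}, n] = \frac{\gamma_n}{\Gamma(\beta_1)\Gamma(\bar{\beta})}\, T^{-1}\left[ (\tau^{\beta_1 - 1} e^{-\tau t} \otimes \tau^{\bar{\beta} - 1})(1) \right]$.

The $T^{-1}$ transform may now be taken to find (see Apps. A, C)

    $I[\rho_1^{\eta_1}, n] = \frac{\gamma_n}{\Gamma(\beta_1)\Gamma(\bar{\beta})}\, (\tau^{\beta_1 + \eta_1 - 1} \otimes \tau^{\bar{\beta} - 1})(1)$.

Now apply Thm. 9.1 in this expression to find for $\eta_1 < 0$

    $I[\rho_1^{\eta_1}, n] = \frac{\gamma_n}{\Gamma(\beta + \eta_1)} \times \frac{\Gamma(\beta_1 + \eta_1)}{\Gamma(\beta_1)}$.

Refer to App. D for the continuation to $\mathrm{Re}(\eta_1) > 0$. Refer to App. E for the existence conditions.

Now, for $k > 1$ apply the identity operator $T^{-1}T$ $k$ times (with respect to $\eta_1, \ldots, \eta_k$ respectively) and
evaluate only the T transforms initially. Since $\sigma_{uv} = \emptyset$ for $u \ne v$, the convolution form of the
transformed $I[\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}, n]$ becomes

    $I[\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}, n] = T_1^{-1} \cdots T_k^{-1}\left[ \left( \left( \otimes_{u=1}^{k} (\otimes_{i \in \sigma_u} p^{n_i} e^{-p t_u}) \right) \otimes (\otimes_{i \notin \sigma_{1 + \cdots + k}} p^{n_i}) \right)(1) \right]$,

while $\bar{\beta}$ becomes $\bar{\beta} = \beta - \beta_{1 + \cdots + k}$. Extend the application of Thm. 9.2 to the $k$ convolution
products $\otimes_{i \in \sigma_u} p^{n_i} e^{-p t_u}$ for $u = 1, \ldots, k$. Do the substitutions and take the inverse transforms to find
the result. QED.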
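
The non-overlap result lends itself to a direct numerical sketch. The Python code below assumes the closed form $I[\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}, n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \prod_u \frac{\Gamma(\beta_u + \eta_u)}{\Gamma(\beta_u)}$, which is the prefactor that appears in Thm. 13, and divides by $I[1, n]$ to obtain a posterior moment under the uniform prior. Function names are hypothetical, real exponents are assumed, and states are indexed from 0.

```python
import math

def log_nonoverlap_integral(n, subsets, etas):
    """log I[prod_u rho_u^eta_u, n] for pairwise disjoint index subsets."""
    seen = set()
    for sigma in subsets:
        if seen & sigma:
            raise ValueError("subsets must be non-overlapping")
        seen |= sigma
    nu = [k + 1 for k in n]                      # nu_i = n_i + 1
    beta, eta = sum(nu), sum(etas)
    out = sum(math.lgamma(v) for v in nu) - math.lgamma(beta + eta)
    for sigma, eta_u in zip(subsets, etas):
        beta_u = sum(nu[i] for i in sigma)       # beta_u = sum of nu_i over sigma_u
        out += math.lgamma(beta_u + eta_u) - math.lgamma(beta_u)
    return out

def posterior_moment(n, subsets, etas):
    """E[prod_u rho_u^eta_u | n] = I[prod_u rho_u^eta_u, n] / I[1, n]."""
    return math.exp(log_nonoverlap_integral(n, subsets, etas)
                    - log_nonoverlap_integral(n, [], []))
```

For counts `[2, 0, 1, 3]` the second moment of $p_0 + p_1$ comes out as $\beta_1(\beta_1 + 1) / (\beta(\beta + 1)) = 20/110$, as expected for a Beta-distributed subset sum.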

Footnote 1 contains a derivation of an interesting identity based on an alternate form of the
result in Thm. 12. Since observations are integer counts in practice, we will be interested in
non-negative integer $n_i$; in this regard, Thm. 12 is more general than we need. (However, below we will
want to differentiate with respect to $\eta$'s, to get logarithms into the integrand. So a result only
applicable to integer $\eta$'s would not suffice.)

Theorem 13 applies Thm. 12 to find non-overlap results needed specifically for the expression
of Bayes estimators for the first two moments of the entropy, mutual information, and various other
functions.

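Theorem 13. If the subsets $\sigma_u$, defined for $u = 1, \ldots, k$, satisfy $\sigma_{uv} = \emptyset$ for all $u \ne v$,
and if $\mathrm{Re}(\beta_u + \eta_u) > 0$ for all $u = 1, \ldots, k$, and if $\mathrm{Re}(\nu_i) > 0$ for all
$i = 1, \ldots, m$ then the following hold.

(13.1) One logarithm of a subset sum.

    $I[(\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}) \times \log(\rho_u), n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \prod_{w=1}^{k} \frac{\Gamma(\beta_w + \eta_w)}{\Gamma(\beta_w)} \times \Delta\Phi^{(1)}(\beta_u + \eta_u, \beta + \eta)$.

(13.2) Two logarithms of subset sums, $u \ne v$.

    $I[(\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}) \times \log(\rho_u) \times \log(\rho_v), n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \prod_{w=1}^{k} \frac{\Gamma(\beta_w + \eta_w)}{\Gamma(\beta_w)}$
        $\times \{ \Delta\Phi^{(1)}(\beta_u + \eta_u, \beta + \eta)\, \Delta\Phi^{(1)}(\beta_v + \eta_v, \beta + \eta) - \Phi^{(2)}(\beta + \eta) \}$.

(13.3) Squared logarithm of a subset sum.

    $I[(\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}) \times \log^2(\rho_u), n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \prod_{w=1}^{k} \frac{\Gamma(\beta_w + \eta_w)}{\Gamma(\beta_w)}$
        $\times \{ \Delta\Phi^{(1)}(\beta_u + \eta_u, \beta + \eta)^2 + \Delta\Phi^{(2)}(\beta_u + \eta_u, \beta + \eta) \}$.

Proof: The proof is done for result (13.1); results (13.2) and (13.3) follow in a similar
manner. Differentiate both sides of the formula for $I[\rho_1^{\eta_1} \cdots \rho_k^{\eta_k}, n]$ given in Thm. 12 with respect to
$\eta_u$ using the fact that $\partial_{\eta} \rho^{\eta} = \rho^{\eta} \log(\rho)$. (See App. C.2 for justification of the interchange of the
integral and derivative.) Doing this gives the desired result. QED.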



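Theorems 12 and 13 dealt with non-overlap sums. Theorems 14 and 15 below discuss pair-wise
overlap sums. In Thm. 14a the non-contained overlap case is discussed. In Thm. 14b the contained
overlap case is discussed. See App. A for the definition of the hypergeometric function ${}_{2,2,0}F_{0,0,1}$.

Theorem 14a. If $\sigma_1$ and $\sigma_2$ satisfy $\sigma_{12} \ne \emptyset$, $\sigma_1 \ne \sigma_{12} \ne \sigma_2$, $\mathrm{Re}(\beta_1 + \eta_1) > 0$,
$\mathrm{Re}(\beta_2 + \eta_2) > 0$, and $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$, then

    $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \frac{\Gamma(\beta_{1+2} + \eta)}{\Gamma(\beta_{1+2})} \times {}_{2,2,0}F_{0,0,1}[(\beta_{1-12}, -\eta_2), (\beta_{2-12}, -\eta_1); \beta_{1+2}; 1, 1]$.

Proof: To begin, assume that $\mathrm{Re}(\eta_i) < 0$, $i = 1, 2$ and that the $\eta_i$ are not integers. Apply
$T_1^{-1} T_2^{-1} T_1 T_2$ ($T_i$ is with respect to $\eta_i$, see App. B.1) to the integral $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n]$. Evaluating the
(non-inverse) T transforms yields the factorable form

    $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = T_1^{-1} T_2^{-1} \left[ \int_{\Delta p} dp \left( \prod_{i \in \sigma_{1-12}} p_i^{n_i} e^{-p_i t_1} \right) \times \left( \prod_{i \in \sigma_{12}} p_i^{n_i} e^{-p_i (t_1 + t_2)} \right) \times \left( \prod_{i \in \sigma_{2-12}} p_i^{n_i} e^{-p_i t_2} \right) \times \left( \prod_{i \notin \sigma_{1+2}} p_i^{n_i} \right) \right]$.

(See App. C.1 for justification of the interchange of the integral over $p$ and the T transform.) Now,
write the transformed integral as the convolution (see Thm. 1)

    $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = T_1^{-1} T_2^{-1} \left[ \left( (\otimes_{i \in \sigma_{1-12}} p^{n_i} e^{-p t_1}) \otimes (\otimes_{i \in \sigma_{12}} p^{n_i} e^{-p(t_1 + t_2)}) \otimes (\otimes_{i \in \sigma_{2-12}} p^{n_i} e^{-p t_2}) \otimes (\otimes_{i \notin \sigma_{1+2}} p^{n_i}) \right)(1) \right]$.

Apply Thm. 9.1 and induction to find (where $\bar{\beta} = \beta - \beta_{1+2}$)

    $(\otimes_{i \notin \sigma_{1+2}} p^{n_i})(\tau) = \frac{\prod_{i \notin \sigma_{1+2}} \Gamma(\nu_i)}{\Gamma(\bar{\beta})} \times \tau^{\bar{\beta} - 1}$.

Similarly, use Thm. 9.2 and induction to find

    $\left( (\otimes_{i \in \sigma_{1-12}} p^{n_i} e^{-p t_1}) \otimes (\otimes_{i \in \sigma_{12}} p^{n_i} e^{-p(t_1 + t_2)}) \otimes (\otimes_{i \in \sigma_{2-12}} p^{n_i} e^{-p t_2}) \right)(\tau) =$
        $\frac{\prod_{i \in \sigma_{1+2}} \Gamma(\nu_i)}{\Gamma(\beta_{1-12})\Gamma(\beta_{12})\Gamma(\beta_{2-12})} \left( p^{\beta_{1-12} - 1} e^{-p t_1} \otimes p^{\beta_{12} - 1} e^{-p(t_1 + t_2)} \otimes p^{\beta_{2-12} - 1} e^{-p t_2} \right)(\tau)$.

Substitute the result for Thm. 11 into the triple convolution above, and substitute the last two
expressions into the convolution form of the transformed integral to find

    $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = T_1^{-1} T_2^{-1} \left[ \frac{\gamma_n}{\Gamma(\beta_{1+2})\Gamma(\bar{\beta})} \times \left( (p^{\beta_{1+2} - 1} \times e^{-p(t_1 + t_2)} \times {}_{1,1,0}F_{0,0,1}[\beta_{1-12}, \beta_{2-12}; \beta_{1+2}; t_2 p, t_1 p]) \otimes p^{\bar{\beta} - 1} \right)(1) \right]$.

Now, take inverse T transforms and apply Thm. 9.1 to find the desired result. Refer to App. E to
determine the conditions for the existence of the identity. Refer to App. D for the continuation of
the result to $\mathrm{Re}(\eta_i) > 0$. Finally, for values of $\eta_i > 0$, refer to App. F. QED.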
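
At non-negative integer exponents the double series in the pair-wise overlap result terminates, because the Pochhammer symbols $(-\eta_1)_j$ and $(-\eta_2)_i$ vanish beyond $j = \eta_1$ and $i = \eta_2$. The Python sketch below evaluates the resulting finite sum exactly, assuming the double-series form $\sum_{i,j} \frac{(\beta_{1-12})_i (-\eta_2)_i (\beta_{2-12})_j (-\eta_1)_j}{(\beta_{1+2})_{i+j}\, i!\, j!}$ for the hypergeometric function at unit arguments, and divides by $I[1, n]$ to give a posterior moment. Names are hypothetical and states are indexed from 0.

```python
import math
from fractions import Fraction

def poch(a, k):
    """Pochhammer symbol (a)_k = a (a+1) ... (a+k-1), with (a)_0 = 1."""
    out = Fraction(1)
    for s in range(k):
        out *= a + s
    return out

def overlap_moment(n, sigma1, sigma2, eta1, eta2):
    """E[rho_1^eta1 rho_2^eta2 | n] for non-negative integer eta1, eta2."""
    nu = [k + 1 for k in n]
    b = lambda s: sum(nu[i] for i in s)
    beta, b_union = sum(nu), b(sigma1 | sigma2)
    b1_only, b2_only = b(sigma1 - sigma2), b(sigma2 - sigma1)
    eta = eta1 + eta2
    # Finite double sum: terms with i > eta2 or j > eta1 are identically zero.
    series = sum(
        poch(b1_only, i) * poch(-eta2, i) * poch(b2_only, j) * poch(-eta1, j)
        / (poch(b_union, i + j) * math.factorial(i) * math.factorial(j))
        for i in range(eta2 + 1) for j in range(eta1 + 1))
    # Gamma(b_union + eta) / Gamma(b_union) and Gamma(beta) / Gamma(beta + eta)
    return poch(b_union, eta) / poch(beta, eta) * series
```

As a check, for $\sigma_1 = \{0, 1\}$, $\sigma_2 = \{1, 2\}$ and $\eta_1 = \eta_2 = 1$ the value agrees with the sum of the elementary Dirichlet moments $E[p_0 p_1] + E[p_0 p_2] + E[p_1^2] + E[p_1 p_2]$.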



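See footnote 2 for a derivation of two interesting identities resulting from alternate forms of
this proof. When $\sigma_2 \subset \sigma_1$ the above result simplifies as in Thm. 14b below.

Theorem 14b. If $\sigma_1$, $\sigma_2$ satisfy $\sigma_{12} \ne \emptyset$, $\sigma_2 \subset \sigma_1$, $\mathrm{Re}(\beta_1 + \eta) > 0$, $\mathrm{Re}(\beta_{12} + \eta_2) > 0$,
and $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$, then

    (14b.1)  $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = \frac{\gamma_n}{\Gamma(\beta + \eta)} \times \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta_1)} \times {}_2F_1[\beta_{1-12}, -\eta_2; \beta_1; 1]$,

    (14b.2)  $\phantom{I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n]} = \frac{\gamma_n}{\Gamma(\beta_{12})} \times \frac{\Gamma(\beta_{12} + \eta_2)}{\Gamma(\beta_1 + \eta_2)} \times \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta + \eta)}$.

Proof: Similar to proof of Thm. 14a, but apply Thm. 10.1 instead of Thm. 11. The second
form (14b.2) of the result is derived by applying Gauss's identity (see Footnote 1) to the first form
of the result above. QED.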
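
The closed form for the contained-overlap case is simple enough to evaluate directly. The Python sketch below assumes the Gamma-function form $\frac{\gamma_n}{\Gamma(\beta_{12})} \frac{\Gamma(\beta_{12} + \eta_2)}{\Gamma(\beta_1 + \eta_2)} \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta + \eta)}$ and divides by $I[1, n] = \gamma_n / \Gamma(\beta)$ to give the posterior moment under the uniform prior. The function name is hypothetical, real exponents are assumed, and states are indexed from 0.

```python
import math

def contained_moment(n, sigma1, sigma2, eta1, eta2):
    """E[rho_1^eta1 rho_2^eta2 | n] when sigma2 is a proper subset of sigma1."""
    if not sigma2 < sigma1:
        raise ValueError("sigma2 must be a proper subset of sigma1")
    nu = [k + 1 for k in n]
    beta = sum(nu)
    b1 = sum(nu[i] for i in sigma1)      # beta_1
    b12 = sum(nu[i] for i in sigma2)     # beta_12 (equals beta_2 here)
    eta = eta1 + eta2
    lg = math.lgamma
    log_val = (lg(beta) - lg(beta + eta)
               + lg(b12 + eta2) - lg(b12)
               + lg(b1 + eta) - lg(b1 + eta2))
    return math.exp(log_val)
```

With $\sigma_1 = \{0, 1\}$, $\sigma_2 = \{0\}$ and $\eta_1 = \eta_2 = 1$, the result equals $E[p_0^2] + E[p_0 p_1]$, and setting $\eta_2 = 0$ recovers the single-subset moment of the non-overlap case.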



Theorems 15a and 15b build upon the results of Thms. 14a and 14b respectively and state
results needed to express specific terms of the various Bayes estimators. Theorem 15a contains
results for the non-contained overlap case. ("Non-contained" means that neither subset of indices is
properly contained within the other.) Thm. 15b contains results for the contained overlap case.
Since we are most directly interested in non-negative integer $\eta$'s and because simplification occurs
at those $\eta$'s, Thm. 15a is stated only for non-negative integer $\eta$'s.

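Theorem 15a. If $\eta_1 \ge 0$ and $\eta_2 \ge 0$ are integers and the conditions for Thm. 14a hold, then

    (15a.1)  $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = C^{(0)} F^{(00)}$,
    (15a.2)  $I[\rho_1^{\eta_1} \log(\rho_1)\, \rho_2^{\eta_2}, n] = C^{(1)} F^{(00)} + C^{(0)} F^{(10)}$,
    (15a.3)  $I[\rho_1^{\eta_1} \log(\rho_1)\, \rho_2^{\eta_2} \log(\rho_2), n] = C^{(2)} F^{(00)} + C^{(1)} [F^{(10)} + F^{(01)}] + C^{(0)} F^{(11)}$,
    (15a.4)  $I[\rho_1^{\eta_1} \log^2(\rho_1)\, \rho_2^{\eta_2}, n] = C^{(2)} F^{(00)} + 2 C^{(1)} F^{(10)} + C^{(0)} F^{(20)}$,

where:

    $C^{(0)} = \frac{\gamma_n}{\Gamma(\beta_{1+2})} \times \frac{\Gamma(\beta_{1+2} + \eta)}{\Gamma(\beta + \eta)}$,   $C^{(1)} = C^{(0)} \Delta\Phi^{(1)}(\beta_{1+2} + \eta, \beta + \eta)$,

    $C^{(2)} = C^{(0)} \times \{ \Delta\Phi^{(1)}(\beta_{1+2} + \eta, \beta + \eta)^2 + \Delta\Phi^{(2)}(\beta_{1+2} + \eta, \beta + \eta) \}$,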

(00) ti n ^1-12^^2-12^ 1 

'i '2 i = o j = o (P 1 + 2 ) (lit -J) ! (Tl 2 -i) ! 

u(10) , ,v1 2 v- (f3 l-12 ) i ( f 3 2-12) j 1 (-1)' r x 

with Qj given by 

+ ea-ri 1 -i)(-i) Tli + 1 ra-ri 1 ). 

(c) F (01) isthe same as (b) with i <-> j and T| ^ <-> T| 2 . 

QQ . (Pi-l 2 ) i (P 2 -l2) j l 

(d) f (11) = V^i-o^o — ( b i+2 ) - ^ QiO^^QiCiV- 

(20) ti oo (|3 l-12 ) i ( l 3 2-12) j 1 (-1) 1 

(e) F^ VV Z^ ^p-^- J (ll2 _ i)| ^fQ 2 0.Tl 1 ), 
with $Q_2$ given by
Q 2 q.r, 1 ) Sl (i-9q-n 1 -i))^4.-,U.,, MTli _ r) 1 (Tli _ s) 

+ 9(j -r,, - 1) + '2ru -r,,)S r - = ' , r ^ i ^l • 



Proof: The proof is done for (15a.2). The other cases have similar proofs. Differentiate
both sides of the expression for $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n]$ given in Thm. 14a with respect to $\eta_1$. After
differentiating, the left-hand side is given by $I[\rho_1^{\eta_1} \log(\rho_1)\, \rho_2^{\eta_2}, n]$. (The justification of the interchange
of the integral and derivative is given in App. C.1.) Write the differentiated right-hand side as

    $\partial_{\eta_1} \left( C^{(0)} \times {}_{2,2,0}F_{0,0,1}[(\beta_{1-12}, -\eta_2), (\beta_{2-12}, -\eta_1); \beta_{1+2}; 1, 1] \right)$.

This expands to $\partial_{\eta_1}(C^{(0)})\; {}_{2,2,0}F_{0,0,1}[\ldots] + C^{(0)}\, \partial_{\eta_1}({}_{2,2,0}F_{0,0,1}[\ldots])$. The derivative of $C^{(0)}$
is given by $\partial_{\eta_1} C^{(0)} = C^{(0)} \times \Delta\Phi^{(1)}(\beta_{1+2} + \eta, \beta + \eta) = C^{(1)}$. The undifferentiated
hypergeometric is evaluated at $\eta_1$ and $\eta_2$ using the results in App. F, cases 1 and 2. This evaluates to $F^{(00)}$
defined in (a) above. The derivative of the hypergeometric may be taken term-by-term (this is
justified below). Use the results in App. F, cases 1 and 2, Eqs. (F.4) and (F.6), to evaluate this
derivative at $\eta_1$ and $\eta_2$. Doing this gives the expression $F^{(10)}$ defined in (b). With these derivatives
and evaluations, the result (15a.2) follows immediately. Now consider the validity of term-by-term
differentiation of the hypergeometric. There exists a closed neighborhood $N$ containing the integer
$\eta_1$ with $\mathrm{Re}(\beta_1 + x) > 0$ $\forall x \in N$. The results of App. F show that any truncation (in $j$) of the series
for ${}_{2,2,0}F_{0,0,1}[(\beta_{1-12}, -\eta_2), (\beta_{2-12}, -x); \beta_{1+2}; 1, 1]$ (see App. A) may be differentiated with
respect to x on N . The sequence of derivatives of the increasing order truncations converges uni- 

r (P 2 _ 12 + j) T(-x+j) 

formly on N. (To see this, note that Sj(x) =L. +l =^0 ■ — ^ n is convergent for 

J ^1 l(Pj + 2 + i-l-j) j.
each $i$, and $\mathrm{Re}(\beta_1 + x) > 0$. Now, note that $S_i(x)$ is a series of terms each monotonic on $N$ with
the same monotonicity in $x$ holding for each term, and that the summation over $i$ in (b) is finite.
These observations and the convergence just established demonstrate the claim of uniform
convergence.) Finally, by Thm. 7.17 of [10], the sequence of derivatives of the increasing order
truncations converges to the derivative of the limit of the series on $N$, justifying the term-by-term
differentiation of the infinite series. QED.

See footnote 3 for some comments regarding alternate forms for the results given above in
Thm. 15a.

Theorem 15b builds on Thm. 14b and states the results for the case in which there are two
subset sums, with the indices of one subset completely contained in the other. Here, unlike in Thm.
15a, there is no hypergeometric function to consider, so the presentation of these results is much
shorter. Further, unlike Thm. 15a, the expressions given are valid for all $\eta$'s in the range specified
(not just at nonnegative integers as in Thm. 15a) because there are no poles in the expressions being
considered at the integers and therefore no further simplification occurs at these points.

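Theorem 15b. If the conditions for Thm. 14b hold, then

    (15b.1)  $I[\rho_1^{\eta_1} \log(\rho_1)\, \rho_2^{\eta_2}, n] = C^{(00)} \times \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta)$,

    (15b.2)  $I[\rho_1^{\eta_1} \rho_2^{\eta_2} \log(\rho_2), n] = C^{(00)} \times \{ \Delta\Phi^{(1)}(\beta_{12} + \eta_2, \beta_1 + \eta_2) + \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta) \}$,

    (15b.3)  $I[\rho_1^{\eta_1} \log(\rho_1)\, \rho_2^{\eta_2} \log(\rho_2), n] = C^{(00)} \times \{ \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta)^2$
             $+ \Delta\Phi^{(1)}(\beta_{12} + \eta_2, \beta_1 + \eta_2)\, \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta) + \Delta\Phi^{(2)}(\beta_1 + \eta, \beta + \eta) \}$,

    (15b.4)  $I[\rho_1^{\eta_1} \log^2(\rho_1)\, \rho_2^{\eta_2}, n] = C^{(00)} \times \{ \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta)^2 + \Delta\Phi^{(2)}(\beta_1 + \eta, \beta + \eta) \}$,

    (15b.5)  $I[\rho_1^{\eta_1} \rho_2^{\eta_2} \log^2(\rho_2), n] = C^{(00)} \times \{ (\Delta\Phi^{(1)}(\beta_{12} + \eta_2, \beta_1 + \eta_2) + \Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta))^2$
             $+ \Delta\Phi^{(2)}(\beta_{12} + \eta_2, \beta_1 + \eta_2) + \Delta\Phi^{(2)}(\beta_1 + \eta, \beta + \eta) \}$,

where $C^{(00)} = \frac{\gamma_n}{\Gamma(\beta_{12})} \times \frac{\Gamma(\beta_{12} + \eta_2)}{\Gamma(\beta_1 + \eta_2)} \times \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta + \eta)}$.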

Proof: The proof is done for (15b.2). The proofs of the other results follow in a similar
manner. The result of Thm. 14b is

    $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = C^{(00)} = \frac{\gamma_n}{\Gamma(\beta_{12})} \times \frac{\Gamma(\beta_{12} + \eta_2)}{\Gamma(\beta_1 + \eta_2)} \times \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta + \eta)}$.

Differentiate both sides of this with respect to $\eta_2$. The left-hand side of the differentiated
expression is $I[\rho_1^{\eta_1} \rho_2^{\eta_2} \log(\rho_2), n]$. (The justification of the interchange of the integral and derivative is
given in App. C.2.) The derivative of $\frac{\Gamma(\beta_{12} + \eta_2)}{\Gamma(\beta_1 + \eta_2)}$ is given by that ratio multiplied by
$\Delta\Phi^{(1)}(\beta_{12} + \eta_2, \beta_1 + \eta_2)$. The derivative of $\frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta + \eta)}$ is given by that ratio multiplied by
$\Delta\Phi^{(1)}(\beta_1 + \eta, \beta + \eta)$. Substituting these expressions for the appropriate derivatives
in the overall derivative of the right-hand side of the equality above for $C^{(00)}$
gives the claimed result. QED.



2d. BAYES ESTIMATORS FOR NON-OVERLAP AND PAIR-WISE OVERLAP TERMS

In this subsection we present those Bayes estimators for the functions of Sec. 1 that can be
expressed with non-overlapping and pair-wise overlapping terms. These include the first and second
posterior moments of the entropy, mutual information, average, and variance, and the first
posterior moments of chi-squared and covariance. The second posterior moments of chi-squared and
covariance appear in Sec. 2g. All of the results presented in this section assume a uniform prior.
1. Entropy, $S(p) = -\sum_i p_i \log(p_i)$. This result appears in [1] in a different form and is stated
here for completeness.

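Theorem 16. If $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$ then

    (16.1)  $E[S(p) \mid n] = -\sum_i \frac{\nu_i}{\nu} \Delta\Phi^{(1)}(\nu_i + 1, \nu + 1)$,

    (16.2)  $E[S^2(p) \mid n] = \sum_{i \ne j} \frac{\nu_i \nu_j}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_i + 1, \nu + 2)\, \Delta\Phi^{(1)}(\nu_j + 1, \nu + 2) - \Phi^{(2)}(\nu + 2) \}$
            $+ \sum_i \frac{\nu_i(\nu_i + 1)}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_i + 2, \nu + 2)^2 + \Delta\Phi^{(2)}(\nu_i + 2, \nu + 2) \}$.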
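
For integer counts the first-moment entropy estimator can be evaluated exactly. The Python sketch below assumes the form $E[S(p) \mid n] = -\sum_i \frac{\nu_i}{\nu} \left( \psi(\nu_i + 1) - \psi(\nu + 1) \right)$ with $\nu_i = n_i + 1$, $\nu = \sum_i \nu_i$ and $\psi$ the digamma function, and uses the integer recurrence $\psi(z + 1) = \psi(z) + 1/z$. The function names are our own, and natural logarithms are assumed.

```python
from fractions import Fraction

def digamma_diff(z1, z2):
    """psi(z1) - psi(z2) for positive integers, via psi(z + 1) = psi(z) + 1/z."""
    lo, hi = sorted((z1, z2))
    diff = sum(Fraction(1, k) for k in range(lo, hi))
    return diff if z1 >= z2 else -diff

def bayes_entropy(n):
    """Posterior mean of the Shannon entropy (in nats) under the uniform prior."""
    nu = [k + 1 for k in n]
    total = sum(nu)
    return -sum(Fraction(v, total) * digamma_diff(v + 1, total + 1) for v in nu)
```

With no data and two states, `bayes_entropy([0, 0])` returns exactly 1/2, which matches the direct integral $\int_0^1 [-p \log p - (1 - p) \log(1 - p)]\, dp$.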



2. Mutual information, $M(p) = \sum_{ij} p_{ij} \log\left( \frac{p_{ij}}{p_{i\cdot}\, p_{\cdot j}} \right)$. In this case the observed counts form a
matrix. Define $\tilde{\nu}_{ij} \equiv \nu_{i\cdot} + \nu_{\cdot j} - \nu_{ij}$.



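Theorem 17. If the $\nu_{ij}$ are non-negative integers $\forall\, ij$ (the integer condition is used only in the
simplification of the IN term in (17.2); for the other terms it may be relaxed) then

(17.1) $E[M(p) \mid n] = IJ - I - J$ where

    $IJ = E[\sum_{ij} p_{ij} \log(p_{ij}) \mid n] = \sum_{ij} \frac{\nu_{ij}}{\nu} \Delta\Phi^{(1)}(\nu_{ij} + 1, \nu + 1)$,

    $I = E[\sum_i p_{i\cdot} \log(p_{i\cdot}) \mid n] = \sum_i \frac{\nu_{i\cdot}}{\nu} \Delta\Phi^{(1)}(\nu_{i\cdot} + 1, \nu + 1)$,

    $J = E[\sum_j p_{\cdot j} \log(p_{\cdot j}) \mid n] = \sum_j \frac{\nu_{\cdot j}}{\nu} \Delta\Phi^{(1)}(\nu_{\cdot j} + 1, \nu + 1)$, and

(17.2) $E[M^2(p) \mid n] = IJMN + IM + JN - 2(IJM + IJN - IN)$ where

    $IJMN = E[\sum_{ij} \sum_{mn} p_{ij} \log(p_{ij})\, p_{mn} \log(p_{mn}) \mid n] =$
        $\sum_{ij} \sum_{mn \ne ij} \frac{\nu_{ij} \nu_{mn}}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{ij} + 1, \nu + 2)\, \Delta\Phi^{(1)}(\nu_{mn} + 1, \nu + 2) - \Phi^{(2)}(\nu + 2) \}$
        $+ \sum_{ij} \frac{\nu_{ij}(\nu_{ij} + 1)}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{ij} + 2, \nu + 2)^2 + \Delta\Phi^{(2)}(\nu_{ij} + 2, \nu + 2) \}$,

    $IM = E[\sum_{i} \sum_{m} p_{i\cdot} \log(p_{i\cdot})\, p_{m\cdot} \log(p_{m\cdot}) \mid n] =$
        $\sum_i \sum_{m \ne i} \frac{\nu_{i\cdot} \nu_{m\cdot}}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{i\cdot} + 1, \nu + 2)\, \Delta\Phi^{(1)}(\nu_{m\cdot} + 1, \nu + 2) - \Phi^{(2)}(\nu + 2) \}$
        $+ \sum_i \frac{\nu_{i\cdot}(\nu_{i\cdot} + 1)}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{i\cdot} + 2, \nu + 2)^2 + \Delta\Phi^{(2)}(\nu_{i\cdot} + 2, \nu + 2) \}$.

(To find JN substitute $\nu_{\cdot j}$ for $\nu_{i\cdot}$ and $\nu_{\cdot n}$ for $\nu_{m\cdot}$ in the expression for IM.)

    $IJM = E[\sum_{ij} \sum_m p_{ij} \log(p_{ij})\, p_{m\cdot} \log(p_{m\cdot}) \mid n] =$
        $\sum_{ij} \sum_{m \ne i} \frac{\nu_{ij} \nu_{m\cdot}}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{ij} + 1, \nu + 2)\, \Delta\Phi^{(1)}(\nu_{m\cdot} + 1, \nu + 2) - \Phi^{(2)}(\nu + 2) \}$
        $+ \sum_{ij} \frac{\nu_{ij}(\nu_{i\cdot} + 1)}{\nu(\nu + 1)} \times \{ \Delta\Phi^{(1)}(\nu_{i\cdot} + 2, \nu + 2)^2 + \Delta\Phi^{(1)}(\nu_{ij} + 1, \nu_{i\cdot} + 1)\, \Delta\Phi^{(1)}(\nu_{i\cdot} + 2, \nu + 2)$
        $+ \Delta\Phi^{(2)}(\nu_{i\cdot} + 2, \nu + 2) \}$.

(To find IJN substitute $\nu_{\cdot n}$ for $\nu_{m\cdot}$ in the expression for IJM.)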
$IN = E[\sum_i \sum_n p_{i\cdot} \log(p_{i\cdot})\, p_{\cdot n} \log(p_{\cdot n}) \mid n] =$
v. (v. + 1) ... ? 

E. '" x { (A^> (1) (v. +2,v + 2) +A4> (2) (v. +2,v + 2)) 

in v (V + 1) ln ln 



v. +v -2v. (v. -v. ) (v -v. ) 

l- ■ n in v i- in' v ■ n in' , 

x{1 - v + vTVTT) } 



+ A$ (1) (v. +2,v + 2)xE°° n — —. — 

v in ' ' i = j-t 



(v -v. ) 

v n in 7 



(v in ) r 



( V. -V. A 
1 + 



v V. +r , 

v in / 



(V. -V. ) 



+ 



i- in' 



(V. ) 

v in' r 



f V -V. A 



■ n m 



1 + ^ 

v. +r 



r = s = 



(Vi.-V/v.n-V.Q^r, 1) Qi(s, 1) 



(v. ) 



where $Q_1$ is defined in Thm. 15a.



Proof: Write $M(p)$ as $M(p) = \sum_{ij} p_{ij} \log(p_{ij}) - \sum_i p_{i\cdot} \log(p_{i\cdot}) - \sum_j p_{\cdot j} \log(p_{\cdot j})$.

To prove (17.1) write $E[M(p) \mid n] = IJ - I - J$, where $IJ = E[\sum_{ij} p_{ij} \log(p_{ij}) \mid n]$ etc. Recall that for the uniform prior $E[F(p) \mid n] = I[F(p), n] / I[1, n]$, and apply Thm. 13.1 with $k = 1$, $\eta_u = 1$, and $\rho_u = p_{ij}$, $\rho_u = p_{i\cdot}$, $\rho_u = p_{\cdot j}$ to simplify the numerators in the expressions IJ, I, J respectively. From Thm. 3, $I[1, n] = \prod_i \Gamma(\nu_i) / \Gamma(\nu)$. The results follow by substitution.

To prove (17.2) first square $M(p)$. In a manner similar to the proof of (17.1), find $E[M^2(p) \mid n] = IJMN + IM + JN - 2(IJM + IJN - IN)$, where each term may be expressed as a ratio of $I[\cdot, \cdot]$ integrals, with $I[1, n]$ in the denominator. The rest of this proof outlines the calculation of the numerator terms in these ratios.

Apply Thm. 13.2 with $k = 2$, $\eta_u = 1$, $\eta_v = 1$, $\rho_u = p_{ij}$, $\rho_v = p_{mn}$ to find the $ij \neq mn$ terms of $I[IJMN \mid n]$, and apply Thm. 13.3 with $k = 1$, $\eta_u = 2$, $\rho_u = p_{ij}$ to find the $ij = mn$ terms of $I[IJMN \mid n]$.

Apply Thm. 13.2 with $k = 2$, $\eta_u = 1$, $\eta_v = 1$, $\rho_u = p_{i\cdot}$, $\rho_v = p_{m\cdot}$ to find the $i \neq m$ terms of $I[IM \mid n]$, and apply Thm. 13.3 with $k = 1$, $\eta_u = 2$, $\rho_u = p_{i\cdot}$ to find the $i = m$ terms of $I[IM \mid n]$. To find $I[JN \mid n]$ substitute $\nu_{\cdot j}$ for $\nu_{i\cdot}$ and $\nu_{\cdot n}$ for $\nu_{m\cdot}$ in the expression for $I[IM \mid n]$.

Apply Thm. 13.2 with $k = 2$, $\eta_u = 1$, $\eta_v = 1$, $\rho_u = p_{ij}$, $\rho_v = p_{m\cdot}$ to find the $i \neq m$ terms of $I[IJM \mid n]$, and apply Thm. 15b.3 with $\eta_1 = 1$, $\eta_2 = 1$ and $\rho_1 = p_{i\cdot}$, $\rho_2 = p_{ij}$ to find the $i = m$ terms of $I[IJM \mid n]$. To find $I[IJN \mid n]$ substitute $\nu_{\cdot n}$ for $\nu_{m\cdot}$ in the expression for $I[IJM \mid n]$.

Apply Thm. 15a.3 with $\eta_1 = 1$, $\eta_2 = 1$ and $\rho_1 = p_{i\cdot}$, $\rho_2 = p_{\cdot n}$ to find $I[IN \mid n]$. QED.
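The first moment (17.1) is simple enough to evaluate directly. The sketch below does so for a table of integer counts, under the assumptions that $\nu_{ij} = n_{ij} + 1$ (uniform prior), that $\Phi^{(1)}$ is the digamma function, and that natural logarithms are used; for positive integers the digamma differences reduce to differences of harmonic numbers, so the result is exact. The function names are hypothetical.

```python
from fractions import Fraction

def harmonic(k):
    """H_k as an exact fraction; psi(k + 1) = H_k - gamma for integer k >= 0."""
    return sum((Fraction(1, j) for j in range(1, k + 1)), Fraction(0))

def mutual_information_mean(table):
    """E[M(p) | n] = IJ - I - J per (17.1), for a matrix of integer counts."""
    nu = [[c + 1 for c in row] for row in table]   # assumption: nu_ij = n_ij + 1
    rows = [sum(r) for r in nu]                    # nu_{i.}
    cols = [sum(c) for c in zip(*nu)]              # nu_{.j}
    tot = sum(rows)                                # nu

    def term(v):
        # (v / nu) * [psi(v + 1) - psi(nu + 1)]
        return Fraction(v, tot) * (harmonic(v) - harmonic(tot))

    ij = sum(term(v) for r in nu for v in r)
    i_only = sum(term(v) for v in rows)
    j_only = sum(term(v) for v in cols)
    return ij - i_only - j_only
```

For an empty 2 x 2 table this gives 1/12, so the estimator is positive even before any data are seen, and it grows when the counts concentrate on the diagonal.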

3) Average $A(p) = \sum_{i=1}^m p_i X_i$. (Note that for $X_i = \delta_{ij}$, (18.1) of Thm. 18 below gives the Laplace Law of Succession estimator for $p_j$.)

Theorem 18. If $\mathrm{Re}(\nu_i) > 0\ \forall i$ then

(18.1) $E[A(p) \mid n] = \sum_i \frac{\nu_i}{\nu} X_i$.

(18.2) $E[A^2(p) \mid n] = \sum_{i \neq j} \frac{\nu_i \nu_j}{\nu(\nu+1)} X_i X_j + \sum_i \frac{\nu_i(\nu_i + 1)}{\nu(\nu+1)} X_i^2$.
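A short sketch of Theorem 18 in exact rational arithmetic follows. It assumes $\nu_i = n_i + 1$ and $\nu = N + m$ for the uniform prior, which is what makes the indicator choice $X_i = \delta_{ij}$ reproduce the Laplace estimator $(n_j + 1)/(N + m)$. The function name is hypothetical.

```python
from fractions import Fraction

def average_moments(counts, xs):
    """Return (E[A|n], E[A^2|n]) per (18.1) and (18.2) as exact fractions."""
    nu = [c + 1 for c in counts]      # assumption: nu_i = n_i + 1
    tot = sum(nu)                     # assumption: nu = N + m
    norm = tot * (tot + 1)
    first = sum(Fraction(v, tot) * x for v, x in zip(nu, xs))
    second = Fraction(0)
    for i, (vi, xi) in enumerate(zip(nu, xs)):
        second += Fraction(vi * (vi + 1), norm) * xi * xi       # i = j terms
        for j, (vj, xj) in enumerate(zip(nu, xs)):
            if i != j:
                second += Fraction(vi * vj, norm) * xi * xj     # i != j terms
    return first, second
```

For example, `average_moments([3, 1, 0], [1, 0, 0])` returns the Laplace estimate 4/7 for the first state together with the second posterior moment of $p_1$.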



4) Variance $V(p) = \sum_i p_i (X_i - \mu_X)^2$.

Note that $E[V(p) \mid n] \neq E[(A(p) - E[A(p) \mid n])^2 \mid n]$; $\mu_X$ is the true mean, not the expected mean, and $V(p)$ refers to the true variance, not the variance in the estimator $E[A(p) \mid n]$.

Theorem 19. If $\mathrm{Re}(\nu_i) > 0\ \forall i$ then

(19.1) $E[V(p) \mid n] = \sum_i \frac{\nu_i(\nu - \nu_i)}{\nu(\nu+1)} X_i^2 - \sum_{i \neq j} \frac{\nu_i \nu_j}{\nu(\nu+1)} X_i X_j$.

(19.2) $E[V^2(p) \mid n] = \sum_{ij} E[p_i p_j \mid n]\, X_i^2 X_j^2 - 2 \sum_{ijk} E[p_i p_j p_k \mid n]\, X_i^2 X_j X_k + \sum_{ijkl} E[p_i p_j p_k p_l \mid n]\, X_i X_j X_k X_l$,

where the expectations are found by applying Thm. 12.
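The first moment (19.1) can be sketched the same way. The code below assumes $\nu_i = n_i + 1$ (uniform prior) and uses a hypothetical function name; it is an illustration of the formula, not of the derivation via Thm. 12.

```python
from fractions import Fraction

def variance_mean(counts, xs):
    """E[V(p) | n] per (19.1), as an exact fraction."""
    nu = [c + 1 for c in counts]      # assumption: nu_i = n_i + 1
    tot = sum(nu)
    norm = tot * (tot + 1)
    m = len(nu)
    diag = sum(Fraction(v * (tot - v), norm) * x * x for v, x in zip(nu, xs))
    cross = sum(Fraction(nu[i] * nu[j], norm) * xs[i] * xs[j]
                for i in range(m) for j in range(m) if i != j)
    return diag - cross
```

Two quick checks: with no data and $X = (0, 1)$ the function returns 1/6, the average of $p(1-p)$ over a uniform p, and a constant $X$ gives zero for any counts.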

5) Covariance $C(p) = \sum_{ij} p_{ij} (X_i - \mu_X)(Y_j - \mu_Y)$.

Theorem 20. If $\mathrm{Re}(\nu_{ij}) > 0\ \forall ij$ then

$E[C(p) \mid n] = \sum_{ij} X_i Y_j \frac{1}{\nu(\nu+1)} [\nu \nu_{ij} - \nu_{i\cdot} \nu_{\cdot j}]$,

found by applying Thm. 14a.



The second posterior moment of C(p) is given in Thm. 26. 
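For the first posterior moment of the covariance in Thm. 20, a minimal sketch is given below. It assumes $\nu_{ij} = n_{ij} + 1$ for the uniform prior, with $\nu_{i\cdot}$ and $\nu_{\cdot j}$ the row and column sums of $\nu_{ij}$; the function name is hypothetical.

```python
from fractions import Fraction

def covariance_mean(table, xs, ys):
    """E[C(p) | n] per Thm. 20 for a matrix of integer counts."""
    nu = [[c + 1 for c in row] for row in table]   # assumption: nu_ij = n_ij + 1
    rows = [sum(r) for r in nu]                    # nu_{i.}
    cols = [sum(c) for c in zip(*nu)]              # nu_{.j}
    tot = sum(rows)                                # nu
    norm = tot * (tot + 1)
    return sum(Fraction(tot * nu[i][j] - rows[i] * cols[j], norm) * xs[i] * ys[j]
               for i in range(len(rows)) for j in range(len(cols)))
```

With no data the bracket $\nu \nu_{ij} - \nu_{i\cdot} \nu_{\cdot j}$ vanishes for every cell, so the estimate is zero; counts concentrated on the diagonal give a positive value.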



6) Chi-Squared $\chi^2(p) = \sum_{ij} \frac{(p_{ij} - p_{i\cdot}\, p_{\cdot j})^2}{p_{i\cdot}\, p_{\cdot j}}$.

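Theorem 21. If $\mathrm{Re}(\nu_{ij}) > 0\ \forall ij$, $\mathrm{Re}(\nu_{i\cdot}) > -1$, and $\mathrm{Re}(\nu_{\cdot j}) > -1$ then

$E[\chi^2(p) \mid n] = -1 + \sum_{ij} \frac{\nu_{ij}(\nu_{ij} + 1)}{(\nu_{i\cdot} + \nu_{\cdot j} - \nu_{ij} + 1)(\nu_{i\cdot} + \nu_{\cdot j} - \nu_{ij})} \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{(\nu_{i\cdot} - \nu_{ij})_m\, (\nu_{\cdot j} - \nu_{ij})_n}{(\nu_{i\cdot} + \nu_{\cdot j} - \nu_{ij} + 2)_{m+n}}$,

found by applying Thm. 14a.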



The second posterior moment of $\chi^2(p)$ is given in Thm. 27.

2e. NOTATION FOR MULTIPLE OVERLAP INTEGRALS.

The calculation of multiple overlap integrals is not straightforward. The process can be summarized as follows. First T transforms are applied and then Thms. 9.1 and 9.2 are used, leaving a convolution product of m terms. Each term in this product has the form $\tau^{\alpha_k} \prod_{i=1}^{n} e^{-a_{ki} t_i \tau}$, $k = 1, \ldots, m$, with the $a_{ki}$'s taking values in $\{0, 1\}$, the $t_i$'s the T transform variables, $\tau$ the convolution variable, and the $\alpha_k$'s constants (see Thm. 14a for an example). The upper index on the product, $n$, is determined by the overlap structure.

Since the convolution operation is both commutative and associative, the convolutions may be done in any order. However, if the convolutions are done in random order, it is quite easy to arrive at an expression for which the inverse T transforms cannot be evaluated in closed form. Further, given any particular ordering of the convolution operations, much bookkeeping needs to be done to actually find the result. Thus, it is important to have a method for quickly determining the salient aspects of any convolution ordering (whether or not the particular order chosen for the convolutions leads to an invertible expression), without actually having to do the convolutions.

In order to facilitate this, Convolution Form (CF) notation is now introduced. This notation captures the relevant aspects of the expressions involved and provides a guide for the rapid calculation of multiple overlap convolution integrals. First, we show how to translate convolution product terms into this notation. Then we use the CF notation to state the relevant algebraic properties of convolutions. It is not expected that the justifications for these properties should be transparent to the reader. In fact, it is not even expected that the reader will have a complete and formal understanding of CF notation; in the interests of simplicity, CF's are presented here simply as a useful algebraic framework for performing calculations. The reader interested in the formal details is directed to App. G.

Write each expression $\tau^{\alpha_k} \prod_{i=1}^{n} e^{-a_i t_i \tau}$ appearing in the convolution product (the T transformed integral mentioned two paragraphs ago) in CF notation as the CF $(a;0)$, where $a$ is the n-vector of constants $a_i$, and where $0$ is the zero n-vector. In writing $(a;0)$, the $t$, $\tau$ and $\alpha_k$ dependence is assumed.

The second component $b$ of the general CF $(a;b)$ may be nonzero (see App. G). However, for the moment there is no need to present the full definition of the CF's encompassing nonzero $b$; for now $b$ should be taken simply as an algebraic object with certain useful properties (these properties are formally justified in App. G). The reader should be aware though that in general, for nonzero $b$, $(a;b)$ represents a set of expressions; i.e. $(a;b)$ does not uniquely specify a single expression. (I.e., it does not uniquely specify a function of $t$, $\tau$ and the $\alpha_k$; see App. G.)

We represent the convolution of two CF's using brackets, so that the convolution of the CF's $(a;b)$ and $(c;d)$ is given by $[(a;b), (c;d)]$. The general algebraic rule for translating this convolution into a CF is

$[(a;b), (c;d)] \subset (c\,;\, b \vee d \vee \mathrm{nz}(c - a))$, (13)

where: i) "$\vee$" is the vector-or operator with $[a \vee b]_i = 1$ if $a_i = 1$ or $b_i = 1$, 0 otherwise; ii) "$-$" is the vector-difference operator with $[a - c]_i = a_i - c_i$; and iii) "nz" is the vector-valued nonzero operator, with $[\mathrm{nz}(a)]_i = 1$ if $a_i \neq 0$, 0 otherwise. We use the symbol "$\subset$" because Eq. (13) is a relationship between two sets of expressions; it means that the convolution product may be written in the form $(c\,;\, b \vee d \vee \mathrm{nz}(c - a))$ (see App. G).

Now we state the algebra of CF's under inverse T transforms. Let $(a;b)$ be any CF with $a_i = 0 = b_i$, for some $i \in \{1, \ldots, m\}$. Further, let $c \neq 0$. Then we have

$T_i^{-1}[(c e_i + a\,;\, b)(\tau)] \subset (a;b)$, and $T_i^{-1}[(c e_i + a\,;\, e_i + b)(\tau)] \subset (a;b)$, (14)

where the inverse transform is with respect to $\tau$, and $[e_i]_j = \delta_{ij}$, the Kronecker-delta function.

In the theorems that follow, any vector occurring as an argument of a CF has all of its components equal to either 0 or 1. Such vectors will be represented by writing a shorthand list of the indices of the non-zero values only. For example, the vector $(1, 0, 1, 1)$ will be represented by the list $\{1, 3, 4\}$. Finally, the $\{\ \}$'s will be dropped from the lists since they are burdensome, and empty lists will be represented with the dash symbol "-". The result of a successful ordering of convolutions and inversions is the CF (-;-), i.e. our goal is to find an ordering of the convolution operations such that the repeated applications of Eqs. (13) and (14) result in (-;-). (See App. G for the justification of this being the condition that allows for the closed form evaluation of the inverse T transforms.)
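The bookkeeping described by Eqs. (13) and (14) can be mechanized. The following is a minimal sketch, not the authors' procedure: it stores each 0/1 vector as a set of its non-zero indices (the shorthand lists above), so that for such vectors nz(c - a) becomes the symmetric difference of the two index sets. The helper names `cf`, `convolve` and `invert` are hypothetical, and the code tracks only the CF labels, not the underlying expressions. It replays the orderings used in the proofs of Thms. 22 and 23 in Sec. 2f and checks that both end in (-;-).

```python
def cf(a=(), b=()):
    """A CF (a;b), each component stored as a frozenset of non-zero indices."""
    return (frozenset(a), frozenset(b))

def convolve(first, second):
    """Eq. (13): [(a;b), (c;d)] -> (c ; b v d v nz(c - a))."""
    (a, b), (c, d) = first, second
    # For 0/1 vectors, nz(c - a) is the set of indices where a and c differ.
    return (c, b | d | (a ^ c))

def invert(form, i):
    """Eq. (14): inverse T transform on index i removes i from both slots."""
    a, b = form
    if i not in a:
        raise ValueError("Eq. (14) needs a non-zero coefficient at index i")
    return (a - {i}, b - {i})

def theorem22_ordering():
    res_a = invert(convolve(cf([1]), cf([1, 2])), 1)   # terms 1 and 12
    res_b = invert(convolve(cf([3]), cf([2, 3])), 3)   # terms 3 and 23
    res_c = convolve(res_a, res_b)                     # results (a) and (b)
    res_d = convolve(res_c, cf([2]))                   # result (c) and term 2
    return invert(res_d, 2)

def theorem23_ordering():
    res_a = convolve(cf([4, 1]), cf([1]))              # terms 41 and 1
    res_b = invert(convolve(cf([1, 2]), res_a), 1)     # term 12, then T_1 inverse
    grp_3 = convolve(cf([3, 4]), cf([3]))              # terms 34 and 3
    res_c = invert(convolve(cf([2, 3]), grp_3), 3)     # term 23, then T_3 inverse
    res_d = convolve(res_b, res_c)
    res_e = invert(convolve(res_d, cf([2])), 2)        # term 2, then T_2 inverse
    return invert(convolve(res_e, cf([4])), 4)         # term 4, then T_4 inverse
```

Both functions return `cf()`, the empty form (-;-). Because the rules are set inclusions, a different ordering that fails to reach (-;-) under this bookkeeping only shows that these two rules do not certify it.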

2f. MULTIPLE OVERLAP INTEGRALS. 

The CF identities (Sec. 2e, Eqs. (13) and (14)) allow the efficient calculation of the needed multiple overlap integrals, as we demonstrate in Thms. 22 and 23. Theorems 24 and 25 apply Thms. 22 and 23 to evaluate the integrals appearing in the Bayes estimators for the second moments of covariance and chi-squared. In section 2g we use these results to write closed form expressions for these moments.

Theorem 22 concerns convolutions in which there are three subsets of indices $\sigma_1, \sigma_2, \sigma_3$ with i) $\sigma_{12} \neq \emptyset$ and $\sigma_{23} \neq \emptyset$, ii) $\sigma_{13} = \emptyset$, and iii) none of the intersections are equal to any of the subsets $\sigma_1, \sigma_2, \sigma_3$. Theorem 23 concerns convolutions in which there are four subsets of indices, where i) the only non-empty $\sigma_{ij}$ with $i \neq j$ are $\sigma_{12}, \sigma_{23}, \sigma_{34}, \sigma_{41}$, and where ii) none of the intersections are equal to any of the subsets $\sigma_1, \sigma_2, \sigma_3, \sigma_4$.

Overlap structures of these types occur in the integrals for the second moments of covariance and chi-squared. Many mathematically equivalent forms of the results of Thms. 22 and 23 are possible; we present the result of one particular calculation in each case and note that equating different forms leads to hypergeometric identities along the lines of those discussed in footnotes 1 and 2.

In what follows "Terms 1 and 12", for example, indicates that the terms $e^{-t_1 \tau} \tau^{\alpha_1 - 1}$ and $e^{-(t_1 + t_2)\tau} \tau^{\alpha_{12} - 1}$ are being convolved. Define the shorthand notations $(\alpha, \beta, \gamma) = \frac{\Gamma(\alpha)\Gamma(\beta)\Gamma(\gamma)}{\Gamma(\alpha + \beta + \gamma)}$ and $(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$. As in App. A, the symbol $(\alpha)_i$ is $\frac{\Gamma(\alpha + i)}{\Gamma(\alpha)}$.

Theorem 22 states the result of a derivation that treats subsets 1 and 3 symmetrically from the outset. For future use, in Thm. 22 the expression being evaluated is defined simply as $I_3$, suppressing the various $\alpha$ and $\eta$ arguments. (The inverse T transforms turn $t$'s into $\eta$'s.)

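Theorem 22. If $\mathrm{Re}(\alpha_1) > 0$, $\mathrm{Re}(\alpha_{12}) > 0$, $\mathrm{Re}(\alpha_2) > 0$, $\mathrm{Re}(\alpha_{23}) > 0$, $\mathrm{Re}(\alpha_3) > 0$, and $\alpha = \alpha_1 + \alpha_{12} + \alpha_2 + \alpha_{23} + \alpha_3$, then

$I_3 = T_1^{-1} T_2^{-1} T_3^{-1} [ e^{-t_1 \tau} \tau^{\alpha_1 - 1} \otimes e^{-(t_1 + t_2)\tau} \tau^{\alpha_{12} - 1} \otimes e^{-t_2 \tau} \tau^{\alpha_2 - 1} \otimes e^{-(t_2 + t_3)\tau} \tau^{\alpha_{23} - 1} \otimes e^{-t_3 \tau} \tau^{\alpha_3 - 1} ]$

$\quad = \tau^{\alpha + \eta - 1}\, (\alpha_1, \alpha_{12})\, (\alpha_3, \alpha_{23})\, (\alpha_1 + \alpha_{12} + \eta_1,\ \alpha_3 + \alpha_{23} + \eta_3,\ \alpha_2)$

$\quad \times\ {}_{2,2,1}F_{1,1,1}[ (\alpha_1, \alpha_1 + \alpha_{12} + \eta_1), (\alpha_3, \alpha_3 + \alpha_{23} + \eta_3), -\eta_2;\ \alpha_1 + \alpha_{12},\ \alpha_3 + \alpha_{23},\ \alpha + \eta_1 + \eta_3;\ 1, 1 ]$.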

Proof: The following derivation applies Eqs. (13) and (14) repetitively to demonstrate that a particular ordering of the convolutions results in the successfully inverted CF (-;-).

Terms 1 and 12: [(1;-), (1,2;-)] ⊂ (1,2;2)

$T_1^{-1}$[(1,2;2)] ⊂ (2;2) (a)

Terms 3 and 23: [(3;-), (2,3;-)] ⊂ (2,3;2)

$T_3^{-1}$[(2,3;2)] ⊂ (2;2) (b)

Results (a) and (b): [(2;2), (2;2)] ⊂ (2;2) (c)

Res. (c) and term 2: [(2;2), (2;-)] ⊂ (2;2) (d)

Invert (d): $T_2^{-1}$[(2;2)] ⊂ (-;-).

Now, carry out the convolutions and inverses in the above order with the appropriate terms substituted as indicated in the left column to find

$I_3 = \tau^{\alpha + \eta - 1} \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{i!\, j!} (-\eta_2)_{i+j}\, (\alpha_1 + i, \alpha_{12})\, (\alpha_3 + j, \alpha_{23})$

$\quad \times (\alpha_1 + \alpha_{12} + i + \eta_1,\ \alpha_3 + \alpha_{23} + j + \eta_3,\ \alpha_2)$.

Rewriting this using hypergeometric notation (see App. A) gives the desired result. QED.

Theorem 23 presents the results for the relevant quadruple overlap sum convolution. As in Thm. 22, here we use the notation $(\alpha, \beta, \gamma) = \frac{\Gamma(\alpha)\Gamma(\beta)\Gamma(\gamma)}{\Gamma(\alpha + \beta + \gamma)}$ and $(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$. In order to simplify the resulting expressions, we do not express the result in hypergeometric notation. For future use, in Thm. 23 the expression being evaluated is defined simply as $I_4$, suppressing the various $\alpha$ and $\eta$ arguments.

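Theorem 23. If $\mathrm{Re}(\alpha_1) > 0$, $\mathrm{Re}(\alpha_{12}) > 0$, $\mathrm{Re}(\alpha_2) > 0$, $\mathrm{Re}(\alpha_{23}) > 0$, $\mathrm{Re}(\alpha_3) > 0$, $\mathrm{Re}(\alpha_{34}) > 0$, $\mathrm{Re}(\alpha_4) > 0$, $\mathrm{Re}(\alpha_{41}) > 0$, and $\alpha = \alpha_1 + \alpha_{12} + \alpha_2 + \alpha_{23} + \alpha_3 + \alpha_{34} + \alpha_4 + \alpha_{41}$ then

$I_4 = T_1^{-1} T_2^{-1} T_3^{-1} T_4^{-1} [ e^{-t_1 \tau} \tau^{\alpha_1 - 1} \otimes e^{-(t_1 + t_2)\tau} \tau^{\alpha_{12} - 1} \otimes e^{-t_2 \tau} \tau^{\alpha_2 - 1} \otimes e^{-(t_2 + t_3)\tau} \tau^{\alpha_{23} - 1} \otimes e^{-t_3 \tau} \tau^{\alpha_3 - 1} \otimes e^{-(t_3 + t_4)\tau} \tau^{\alpha_{34} - 1} \otimes e^{-t_4 \tau} \tau^{\alpha_4 - 1} \otimes e^{-(t_4 + t_1)\tau} \tau^{\alpha_{41} - 1} ]$

$\quad = \tau^{\alpha + \eta - 1} \sum_{i,j,m,n,p,q=0}^{\infty} (-1)^{i+j+m+n} \frac{1}{i!\, j!\, m!\, n!\, p!\, q!} (-\eta_2)_{j+m+p}\, (-\eta_4)_{i+n+q}$

$\quad \times (\alpha_{41} + i,\ \alpha_1,\ \alpha_{12} + j)\, (\alpha_{23} + m,\ \alpha_3,\ \alpha_{34} + n)$

$\quad \times (\alpha_{41} + \alpha_1 + \alpha_{12} + \eta_1 + i + j,\ \alpha_{23} + \alpha_3 + \alpha_{34} + \eta_3 + m + n)$

$\quad \times (\alpha - \alpha_2 - \alpha_4 + \eta_1 + \eta_3 + i + j + m + n + p,\ \alpha_2)$

$\quad \times (\alpha - \alpha_4 + \eta_1 + \eta_2 + \eta_3 + i + n + q,\ \alpha_4)$.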

Proof: The following derivation applies Eqs. (13) and (14) repetitively to demonstrate that a particular ordering of the convolutions results in the successfully inverted CF (-;-). Initially, convolutions are done on two non-overlapping groups - the group depending on $t_1$ and the group depending on $t_3$. Then the inverses $T_1^{-1}$ and $T_3^{-1}$ are taken and the remaining convolutions are done with care taken to create expressions that have simple T transform inverses.

For the $t_1$ group (terms 41, 1, 12),

Terms 41 and 1: [(4,1;-), (1;-)] ⊂ (1;4) (a)

Term 12 and Res. (a): [(1,2;-), (1;4)] ⊂ (1;2,4)

$T_1^{-1}$[(1;2,4)] ⊂ (-;2,4) (b)

Similarly for the $t_3$ group (terms 23, 3, 34): ⊂ (-;2,4) (c)

Res. (b) and (c): [(-;2,4), (-;2,4)] ⊂ (-;2,4) (d)

Res. (d) and term 2: [(-;2,4), (2;-)] ⊂ (2;2,4)

$T_2^{-1}$[(2;2,4)] ⊂ (-;4) (e)

Res. (e) and term 4: [(-;4), (4;-)] ⊂ (4;4)

$T_4^{-1}$[(4;4)] ⊂ (-;-).

Now, carry out the convolutions and inverses in the above order with the appropriate terms substituted as indicated in the left column to find the desired result. QED.

Theorems 24 and 25 utilize Thms. 22 and 23 respectively to find the needed multiple overlap integrals. They are given without proof, as they follow immediately from these theorems, Thm. 9.1, and induction.

Theorem 24. Assume the overlap structure relevant to Thm. 22, with $C_3$ defined as the $\tau$-independent factor of the $I_3$ defined in Thm. 22. (I.e., $I_3 = \tau^{\alpha + \eta - 1} C_3$.) Define $\bar{\beta} = \beta - \beta_{1+2+3}$. If $\mathrm{Re}(\beta_u + \eta_u) > 0$ for $u = 1, 2, 3$, then

Itp^p^n] = r(|3 + ri) -xC 3 [(3 1 _ 12 ,|3 12) |3 2 _ 12 _ 23) (3 23) (3 3 _ 23 ;ri 1) ri 2) ri 3 ] 



Theorem 25. Assume the overlap structure relevant to Thm. 23, with $C_4$ defined as the $\tau$-independent factor of the $I_4$ defined in Thm. 23. (I.e., $I_4 = \tau^{\alpha + \eta - 1} C_4$.) Define $\bar{\beta} = \beta - \beta_{1+2+3+4}$. If $\mathrm{Re}(\beta_u + \eta_u) > 0$ for $u = 1, 2, 3, 4$, then

i[p7'p>; s pj«,n] = 

x C 4 [(3j _ 12 , (3 12 , |3 2 _ 12 _ 23 , P 23 , P 3 _ 23 _34> P 34 > P4_34_4i> ^41^1' ^2' ^3' - 

2g. BAYES ESTIMATORS - MULTIPLE OVERLAP TERMS. 

In this subsection we complete the presentation of the results for the Bayes estimators (see 
Sec. 2d) by giving the Bayes estimators with uniform prior for the second powers of covariance 
and chi-squared. Since the complete description of these results is quite lengthy, we present them 
in recipe form; at this point the reader should be able to make the needed substitutions. 

5) Covariance $C(p) = \sum_{ij} p_{ij} (X_i - \mu_X)(Y_j - \mu_Y)$.

Theorem 26. If $\mathrm{Re}(\nu_{ij}) > 0\ \forall ij$ then

$E[C^2(p) \mid n] = \sum_{ijkl} E[p_{ij}\, p_{kl} \mid n]\, X_i Y_j X_k Y_l - 2 \sum_{ijkl} E[p_{ij}\, p_{k\cdot}\, p_{\cdot l} \mid n]\, X_i Y_j X_k Y_l + \sum_{ijkl} E[p_{i\cdot}\, p_{\cdot j}\, p_{k\cdot}\, p_{\cdot l} \mid n]\, X_i Y_j X_k Y_l$,

where $E[p_{ij}\, p_{kl} \mid n]$ is found by applying Thm. 12, $E[p_{ij}\, p_{k\cdot}\, p_{\cdot l} \mid n]$ is found by applying Thm. 14a, and $E[p_{i\cdot}\, p_{\cdot j}\, p_{k\cdot}\, p_{\cdot l} \mid n]$ is either a single, double, or quadruple overlap term found by applying Thms. 14a, 24, and 25: if $i = k$, $j = l$, then apply Thm. 14a; if $i = k$, $j \neq l$ or $i \neq k$, $j = l$, then apply Thm. 24; if $i \neq k$, $j \neq l$ then apply Thm. 25.

6) Chi-Squared $\chi^2(p) = \sum_{ij} \frac{(p_{ij} - p_{i\cdot}\, p_{\cdot j})^2}{p_{i\cdot}\, p_{\cdot j}}$.

Theorem 27. If $\mathrm{Re}(\nu_{ij}) > 0\ \forall ij$, $\mathrm{Re}(\nu_{i\cdot}) > -1$ and $\mathrm{Re}(\nu_{\cdot j}) > -1$ then

$E[(\chi^2(p))^2 \mid n] = \sum_{ijkl} E\left[ \frac{p_{ij}^2\, p_{kl}^2}{p_{i\cdot}\, p_{\cdot j}\, p_{k\cdot}\, p_{\cdot l}} \,\Big|\, n \right] - 2 E[\chi^2(p) \mid n] - 1$,

where $E[\chi^2(p) \mid n]$ is given in Thm. 21 and where $E\left[ \frac{p_{ij}^2\, p_{kl}^2}{p_{i\cdot}\, p_{\cdot j}\, p_{k\cdot}\, p_{\cdot l}} \,\Big|\, n \right]$ is either a single, double, or quadruple overlap term found by applying Thms. 14a, 24, and 25: if $i = k$, $j = l$, then apply Thm. 14a; if $i = k$, $j \neq l$ or $i \neq k$, $j = l$, then apply Thm. 24; if $i \neq k$, $j \neq l$ then apply Thm. 25.



3. EXTENSION OF THE CLASS OF CALCULABLE PRIORS 

The calculations of this paper and [1] are done under the assumption of a uniform prior $P(p)$. However they also apply essentially unchanged when certain other priors are used. Here we briefly discuss applying the calculations of these papers to cases where the prior is not uniform. As specific examples of how the calculations of these papers are modified when the prior is not uniform, we consider priors of the form $P(p) \propto \Delta(p) \prod_{i=1}^m p_i^{r_i}$ (the Dirichlet priors are the subset of these with all $r_i$ equal) and the entropic prior $P(p) \propto \Delta(p)\, e^{\alpha S(p)}$.

Define $\tilde{P}(p)$ implicitly through $P(p) = \Delta(p) \tilde{P}(p)$. ($\tilde{P}(p)$ is the non-delta function part of $P(p)$.) The uniform prior has a constant $\tilde{P}(p)$. Even if $\tilde{P}(p)$ is not uniform, it is often the case that $I[F(p)\tilde{P}(p), n] = \int dp\, F(p) P(p) \prod_{i=1}^m p_i^{n_i}$ and $I[\tilde{P}(p), n] = \int dp\, P(p) \prod_{i=1}^m p_i^{n_i}$ are of the form of integrals evaluated in these papers. (In this section all integrals are over $p$'s with nonnegative components.) In these cases we can evaluate the Bayes estimator for $F(p)$ with prior $P(p)$ since $E[F(p) \mid n] = I[F(p)\tilde{P}(p), n] / I[\tilde{P}(p), n]$.

For example, the Bayes estimator for $F(p)$ with prior $P(p)$ can often be evaluated using the calculations of these papers when $P(p) \propto \Delta(p) \prod_{i=1}^m f_i(p_i)$. In particular consider priors of the form $P(p) \propto \Delta(p) \prod_{i=1}^m p_i^{r_i}$ with $\mathrm{Re}(r_i) > -1$, $i = 1, \ldots, m$. (Dirichlet priors have $r_i = r$ for all $i$.) When $P(p)$ is of this form the Bayes estimator for $F(p)$ with prior $P(p)$ is given by

$E[F(p) \mid n] = \frac{I[F(p), n + r]}{I[1, n + r]}$. (15)
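Equation (15) says that a prior of the form $\prod_i p_i^{r_i}$ is handled by shifting the counts, $n \to n + r$, in any uniform-prior result. The sketch below illustrates this for the average of Thm. 18, assuming $\nu_i = n_i + 1$ in the uniform-prior formula, so that the shifted estimator becomes $\sum_i X_i (n_i + r_i + 1) / (N + \sum_i r_i + m)$. The function name and the example exponents are hypothetical.

```python
from fractions import Fraction

def average_mean_with_prior(counts, xs, r=None):
    """E[A(p) | n] under a prior prod_i p_i^{r_i}, via Eq. (15) and (18.1)."""
    m = len(counts)
    r = [0] * m if r is None else r            # r = 0 recovers the uniform prior
    if any(ri <= -1 for ri in r):
        raise ValueError("Eq. (15) requires r_i > -1")
    shifted = [Fraction(n) + ri for n, ri in zip(counts, r)]   # n -> n + r
    nu = [s + 1 for s in shifted]              # assumption: nu_i = n_i + 1
    return sum(v * x for v, x in zip(nu, xs)) / sum(nu)
```

For counts (3, 1) and the indicator of the first state, the uniform prior gives 2/3, $r = (1, 1)$ gives 5/8, and $r = (-1/2, -1/2)$ gives 7/10.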

As another example, consider the class of priors with $\tilde{P}(p)$ represented by a Taylor series converging everywhere in the domain of $p$. Using this Taylor series representation expand both $I[F(p)\tilde{P}(p), n]$ and $I[\tilde{P}(p), n]$ into infinite sums of integrals. If all of the integrals are of a form evaluated in this paper then we can find $E[F(p) \mid n]$ for the prior $P(p)$.

For example, consider the entropic prior $P(p) \propto \Delta(p)\, e^{\alpha S(p)}$, where $\alpha$ is some constant. In some applications the entropy $S(p)$ is taken to be $S(p) = -\sum_{i=1}^m p_i \log(p_i)$, while in image processing applications $S(p)$ is often defined as $S(p) = \sum_{i=1}^m \left( p_i - m_i - p_i \log\left( \frac{p_i}{m_i} \right) \right)$, where $m$ is known as the "model" [11]. In either case, $\tilde{P}(p)$ may be expanded in the series

$\tilde{P}(p) = \sum_{i=0}^{\infty} \frac{\alpha^i}{i!} S^i(p)$. (16)

Whenever the products $S^i(p) F(p)$, $i = 0, 1, \ldots$, are of the form of some function integrated in these papers then closed form results (up to series truncations) for the Bayes estimator for $F(p)$ with an entropic prior are available. An application of these ideas appears in [12], where they are used to calculate the normalization constant of the entropic prior.

APPENDICES



A. HYPERGEOMETRIC FUNCTION NOTATION 



Here we present the notation for the hypergeometric functions used in this paper. Paralleling Lebedev [13], let $a$ and $b$ be vectors of dimensions $p$ and $q$ respectively. Define $(\lambda)_k = \Gamma(\lambda + k)/\Gamma(\lambda)$. Define the single summation hypergeometrics ${}_pF_q$ by

${}_pF_q[a; b; x] = \sum_{i=0}^{\infty} \frac{\prod_{\alpha=1}^{p} (a_\alpha)_i}{\prod_{\beta=1}^{q} (b_\beta)_i} \frac{x^i}{i!}$. (A.1)

An example of a single summation hypergeometric is ${}_1F_1[\alpha; \beta; x]$, which has the integral representation for $\beta > \alpha > 0$ [13]

${}_1F_1[\alpha; \beta; x] = \frac{\Gamma(\beta)}{\Gamma(\alpha)\Gamma(\beta - \alpha)} \int_0^1 dt\, e^{xt}\, t^{\alpha - 1} (1 - t)^{\beta - \alpha - 1}$. (A.2)
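The series (A.1) and the integral representation (A.2) can be compared numerically. The sketch below truncates the series after a fixed number of terms and evaluates the integral with a composite midpoint rule; both the truncation length and the quadrature are illustrative choices, and the function names are hypothetical. The midpoint rule is only adequate here when $\alpha \geq 1$ and $\beta - \alpha \geq 1$, so that the integrand has no endpoint singularities.

```python
import math

def pochhammer(x, k):
    """(x)_k = Gamma(x + k) / Gamma(x) = x (x + 1) ... (x + k - 1)."""
    out = 1.0
    for j in range(k):
        out *= x + j
    return out

def hyp_pfq(a, b, x, terms=60):
    """Truncated single-summation series (A.1) for pFq[a; b; x]."""
    total = 0.0
    for i in range(terms):
        num = math.prod(pochhammer(ai, i) for ai in a)
        den = math.prod(pochhammer(bi, i) for bi in b)
        total += num / den * x ** i / math.factorial(i)
    return total

def kummer_integral(alpha, beta, x, steps=4000):
    """Right-hand side of (A.2) by the composite midpoint rule on [0, 1]."""
    h = 1.0 / steps
    acc = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        acc += math.exp(x * t) * t ** (alpha - 1) * (1 - t) ** (beta - alpha - 1)
    pref = math.gamma(beta) / (math.gamma(alpha) * math.gamma(beta - alpha))
    return pref * acc * h
```

With empty parameter lists the series is the exponential, `hyp_pfq([], [], 1.0)`, and for $\alpha = 2$, $\beta = 4$, $x = 1$ the series and the integral agree to within the quadrature error.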

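Now, given vectors $a^1, a^2, a^{12}, b^1, b^2, b^{12}$ of dimensions $p_1, p_2, p_{12}, q_1, q_2, q_{12}$ respectively, define the double summation hypergeometrics

${}_{p_1, p_2, p_{12}}F_{q_1, q_2, q_{12}}[a^1, a^2, a^{12};\ b^1, b^2, b^{12};\ x_1, x_2] = \sum_{i=0}^{\infty} \sum_{j=0}^{\infty} \frac{\prod_{\alpha=1}^{p_1} (a^1_\alpha)_i\ \prod_{\alpha=1}^{p_2} (a^2_\alpha)_j\ \prod_{\alpha=1}^{p_{12}} (a^{12}_\alpha)_{i+j}}{\prod_{\beta=1}^{q_1} (b^1_\beta)_i\ \prod_{\beta=1}^{q_2} (b^2_\beta)_j\ \prod_{\beta=1}^{q_{12}} (b^{12}_\beta)_{i+j}} \frac{x_1^i\, x_2^j}{i!\, j!}$. (A.3)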

In writing arguments of F's, vectors will be denoted by a list of the elements, e.g., $c = (c_1, \ldots, c_k)$. However, when listing the components of a 1-dimensional vector the parentheses will be dropped. Further, when any of the $p$ or $q$ subscripts are zero (which corresponds to an empty argument for that position), the empty vector argument of the hypergeometric will simply be omitted from the list of arguments.

B. TRANSFORMS



In App. B.1 we discuss the T transform. In App. B.2 we discuss the Z transform.

B.1. THE T TRANSFORM

In order to calculate averages of the form $E[(\rho^\eta)(\log(\rho^\eta)) \mid n]$ (with $\rho = \sum_{i \in \sigma} p_i$) and similar averages, the identity

$\Gamma(-\eta) = \int_0^\infty u^{-\eta - 1} e^{-u}\, du$, $\mathrm{Re}(-\eta) > 0$ (B.1)

for the gamma function is needed. (Here $-\eta$ has been used instead of $\eta$ in order to simplify the following.) Make the change of variables $u = \rho t$. With $\rho > 0$, independent of $t$, we find

$\rho^\eta = \frac{1}{\Gamma(-\eta)} \int_0^\infty t^{-\eta - 1} e^{-\rho t}\, dt$, $\mathrm{Re}(-\eta) > 0$, $\rho > 0$. (B.2)

Define the operator $T^{-1}$ by $T^{-1}[F(\cdot)](\eta) = \frac{1}{\Gamma(-\eta)} \int_0^\infty t^{-\eta - 1} F(t)\, dt$, $\mathrm{Re}(-\eta) > 0$, and define the transform $T$ by $T[T^{-1}[F(\cdot)]] = F(\cdot)$. As defined, the transform $T$ is closely related to the Mellin transform [14] (it is an inverse-Mellin transform) and we rely on this similarity to establish the conditions for the existence of the transform and its inverse. Of interest in this work is the following: For $\rho$ independent of $t$, the functions $\rho^\eta$ and $e^{-\rho t}$ form a transform pair. That is,

$T[\rho^\eta](t) = e^{-\rho t}$ and $T^{-1}[e^{-\rho t}](\eta) = \rho^\eta$. (B.3)
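The transform pair can be checked numerically from the integral in (B.2). The sketch below approximates $\frac{1}{\Gamma(-\eta)} \int_0^\infty t^{-\eta-1} e^{-\rho t} dt$ with a midpoint rule on a truncated interval and compares it with $\rho^\eta$. The truncation point, the step count and the function name are illustrative assumptions, and the quadrature is only accurate when $-\eta - 1 \geq 0$, so that the integrand is bounded at $t = 0$.

```python
import math

def inverse_t_of_exponential(rho, eta, steps=200_000):
    """Approximate T^{-1}[exp(-rho t)](eta) as defined above (B.3)."""
    if eta >= 0 or rho <= 0:
        raise ValueError("requires eta < 0 and rho > 0")
    upper = 60.0 / rho            # exp(-60) makes the neglected tail negligible
    h = upper / steps
    acc = 0.0
    for k in range(steps):
        t = (k + 0.5) * h         # midpoint of the k-th subinterval
        acc += t ** (-eta - 1) * math.exp(-rho * t)
    return acc * h / math.gamma(-eta)
```

For instance, `inverse_t_of_exponential(0.5, -2.0)` is close to $0.5^{-2} = 4$.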
As an example of the use of the transform $T$, when finding $E[\rho^\eta \mid n]$, where $\rho$ is a sum of a subset of the $p_i$'s and $\mathrm{Re}(\eta) < 0$, the T transform will be taken of $E[\rho^\eta \mid n]$ with respect to $\eta$, which substitutes $e^{-\rho t}$ for $\rho^\eta$:

$T[E[\rho^\eta \mid n]](t) = T[\int dp\, P(p \mid n)\, \rho^\eta] = \int dp\, P(p \mid n)\, T[\rho^\eta] = \int dp\, P(p \mid n)\, e^{-\rho t}$. (B.4)

See App. C.1 for the justification of the commutation of the integrals. After the transform has been done, often the last integral may be computed in closed form (see Sec. 2) and then the $T^{-1}$ transform may be applied.

B.2. THE Z TRANSFORM 

Let $f(n)$ be any function that factors as $f(n) = \prod_{i=1}^m f_i(n_i)$. For such functions, the Z transform $Z[f_i](z) = \sum_{n=0}^{\infty} f_i(n) z^n$ is useful in simplifying calculations involving sums $\sum_n f(n)$, where the summation extends over all $n$ having non-negative integer components and $\sum_i n_i = N$. Define the discrete convolution product of two functions $g$ and $h$ by $(g \otimes h)(n) = \sum_{i=0}^{n} g(i) h(n - i)$. (Note that $\otimes$ is both commutative and associative, so that the order that the convolutions are taken in is irrelevant, justifying the use of the above notation when several functions are involved.)

The Z transform convolution theorem may be thought of as a discretized form of the Laplace convolution theorem (see Thm. 2).

Theorem B.1: If $F(N) = \sum_n f(n)$ and $f(n) = \prod_{i=1}^m f_i(n_i)$ then $F(N) = (\otimes_{i=1}^m f_i)(N)$ and $Z[F](z) = \prod_{i=1}^m Z[f_i](z)$, for all $z$ such that $Z[f_i](z)$, $i = 1, \ldots, m$, converges.

Proof: For $m = 2$ we have $F(N) = (f_1 \otimes f_2)(N)$ and the Z transforms of $f_1$ and $f_2$ are given by $Z[f_i](z) = \sum_{n=0}^{\infty} f_i(n) z^n$, $i = 1, 2$, respectively. For $z$ within the radii of convergence of both of these power series, we have (after collecting terms having the same power of $z$)

$Z[f_1](z) \times Z[f_2](z) = \sum_{n=0}^{\infty} z^n \sum_{i=0}^{n} f_1(i) f_2(n - i)$.

The right-hand side is immediately seen to be $Z[F](z)$. The result for arbitrary $m$ follows by induction. QED.



Note that due to the uniqueness of power series representations, inverses of Z transforms exist 
on the nonnegative integers. 
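Theorem B.1 can be checked on small cases. The sketch below computes $F(N)$ in two ways: by brute-force summation over all count vectors with $\sum_i n_i = N$, and as the coefficient of $z^N$ in the product of the power series $Z[f_i](z)$, each truncated at order $N$. The function names are hypothetical.

```python
from itertools import product

def constrained_sum(fs, N):
    """F(N): sum over n with n_1 + ... + n_m = N of prod_i f_i(n_i)."""
    total = 0
    for n in product(range(N + 1), repeat=len(fs)):
        if sum(n) == N:
            term = 1
            for f, k in zip(fs, n):
                term *= f(k)
            total += term
    return total

def series_product_coefficient(fs, N):
    """Coefficient of z^N in prod_i Z[f_i](z), via repeated discrete convolution."""
    coeffs = [1] + [0] * N                      # the constant series 1
    for f in fs:
        fi = [f(k) for k in range(N + 1)]       # Z[f_i] truncated at z^N
        coeffs = [sum(coeffs[i] * fi[n - i] for i in range(n + 1))
                  for n in range(N + 1)]
    return coeffs[N]
```

With every $f_i$ identically 1 both routes count the compositions of $N$ into $m$ non-negative parts, which is the binomial coefficient $\binom{N + m - 1}{m - 1}$.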

C. COMMUTING LINEAR OPERATORS 

In App. C.1 we discuss the interchange of integrals. In App. C.2 we discuss the interchange of derivatives and integrals.

C.1. COMMUTING TWO INTEGRALS

Interchanging the integrals appearing in these papers as the $p$ integral and the T transform integral is possible due to Fubini's theorem [15], which justifies the interchange of uncoupled integrations (the region of integration of either integral does not depend on the other integral's parameters) when the double integral exists.

C.2. COMMUTING INTEGRALS AND DERIVATIVES 



Consider differentiating the integral $\int F(x, t)\, dx$ with respect to $t$. Theorem C.1 generalizes Thm. 9.42 of [10] and establishes conditions general enough to allow the commutation of the derivative and integral for the functions $F(x, t)$ appearing in this paper. Define $D_2 F(x, t)$ to be the partial derivative of $F$ with respect to its second argument, evaluated at $(x, t)$.

Theorem C.1: If

(1) $F(x, t)$ and $D_2 F(x, t)$ are defined for $(x, t) \in A_x \times A_t$, where $A_x = (0, \infty)$, and where $A_t$ is convex,

(2) $\int_0^\infty F(x, t)\, dx$ exists $\forall t \in A_t$,

(3) $\forall \epsilon > 0$ and $b > 0$, $\exists f(x)$ with $f(x) \geq 0$ for $x \in A_x$, and $\delta > 0$, such that $\int_b^\infty f(x)\, dx < \delta$ and $\forall x > b$, $\forall s, t \in A_t$, $|t - s| < \delta \Rightarrow |D_2 F(x, t) - D_2 F(x, s)| < f(x)$,

then $D_2 \int_0^\infty F(x, t)\, dx = \int_0^\infty D_2 F(x, t)\, dx$ on $A_x \times A_t$.



Proof: Let $\phi(s, t) = \frac{F(x, t) - F(x, s)}{t - s}$ for $s \neq t$. By (1) and the mean value theorem, $\forall t > s$ with $t, s \in A_t$, $\exists u(s, t) \in [s, t]$ such that $\phi(s, t) = D_2 F(x, u(s, t))$. Using this and (3) we have that for any $\epsilon > 0$, $\forall b > 0$, $\exists \delta > 0$ and a nowhere-negative (in $A_x$) $f(x)$ obeying $\int_b^\infty f(x)\, dx < \delta$ such that if $|t - s| < \delta$ and $x > b$, then $|\phi(s, t) - D_2 F(x, t)| = |D_2 F(x, u(s, t)) - D_2 F(x, t)| < f(x)$. From this and (2) it follows that for all $b > 0$, $\exists \delta > 0$, and a nowhere-negative (in $A_x$) $f(x)$ obeying $\int_b^\infty f(x)\, dx < \delta$ such that if $|t - s| < \delta$, then

$\left| \int_b^\infty \phi(s, t)\, dx - \int_b^\infty D_2 F(x, t)\, dx \right| \leq \int_b^\infty f(x)\, dx < \epsilon$.

Taking the limit $s \to t$, noting that $\lim_{s \to t} \int_b^\infty \phi(s, t)\, dx = D_2 \int_b^\infty F(x, t)\, dx$, and finally taking $\epsilon \to 0$ with $b = \delta$, we arrive at the desired result. QED.

The functions $F(x, t)$ of interest in this paper have the form $F(x, t) = x^t \log(x)^m e^{-cx}$ with $\mathrm{Re}(t) > -1$ and $c > 0$. For these functions it may be shown that the conditions of Thm. C.1 hold.



D. EXPANDING ETA'S DOMAIN 

To apply the T transform, the assumption that all $\eta_i < 0$ had to be made. Here we present a simple theorem that expands the region of validity of the various expressions derived in this paper to the region where any of the $\eta_i$ may be non-negative. We present the theorem for the single subset sum case only, although the multiple non-overlapping subset case and the contained overlap case may be handled in an almost identical manner.

Theorem D.1: If $\mathrm{Re}(\eta + \beta_1) > 0$, $\mathrm{Re}(\eta) \geq 0$ and $\mathrm{Re}(n_i) > -1$, $i = 1, \ldots, m$, then

$I[\rho_1^\eta, n] = \gamma_n \frac{\Gamma(\beta_1 + \eta)}{\Gamma(\beta_1)\, \Gamma(\eta + \beta)}$.

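Proof: Note that $\eta \geq 0$ implies that there is an integer $q > 0$ and an $\tilde{\eta} < 0$ such that $\eta = \tilde{\eta} + q$. Thus $I[\rho^\eta, n]$ may be rewritten as

$I[\rho^\eta, n] = \sum_{i \in \sigma} I[p_i\, \rho^{\eta - 1}, n] = \sum_{i \in \sigma} I[\rho^{\tilde{\eta} + q - 1}, n + e_i]$,

where $[e_i]_j = \delta_{ij}$ and $\delta$ is the Kronecker delta function. Iterate this operation $q$ times (removing one power from $\rho$ and summing with an increased count vector each time) to find

$I[\rho^\eta, n] = \sum_{i_1 \in \sigma} \cdots \sum_{i_q \in \sigma} I[\rho^{\tilde{\eta}}, n + e_{i_1} + \cdots + e_{i_q}]$.

Simplify this to yield

$I[\rho^\eta, n] = \sum_{\mathbf{q}} \binom{q}{\mathbf{q}} I[\rho^{\tilde{\eta}}, n + \mathbf{q}]$,

where the vector $\mathbf{q}$ has nonnegative integer components summing to $q$ with $q_i = 0$ for $i \notin \sigma$. The symbol $\binom{q}{\mathbf{q}} = \frac{\Gamma(q + 1)}{\prod_{i=1}^m \Gamma(q_i + 1)}$ is the usual multinomial coefficient. Since $\tilde{\eta} < 0$, evaluate the integral $I[\rho^{\tilde{\eta}}, n + \mathbf{q}]$ using Thm. 12 with $k = 1$ (noting that $\beta_1$ and $\beta$ increase by $q$ due to $\mathbf{q}$ being added to $n$) to find

$I[\rho^\eta, n] = \frac{\Gamma(\beta_1 + \tilde{\eta} + q)}{\Gamma(\beta_1 + q)\, \Gamma(\beta + \tilde{\eta} + q)} \sum_{\mathbf{q}} \binom{q}{\mathbf{q}} \gamma_{n + \mathbf{q}}$. (*)

Now, we put $\sum_{\mathbf{q}} \binom{q}{\mathbf{q}} \gamma_{n + \mathbf{q}}$ into closed form by noting that it is the discrete convolution product of the functions $\Gamma(n_i + q_i + 1)/\Gamma(q_i + 1)$ of $q_i$ given by

$\sum_{\mathbf{q}} \binom{q}{\mathbf{q}} \gamma_{n + \mathbf{q}} = \Gamma(q + 1) [\otimes_{i \in \sigma} \Gamma(n_i + q_i + 1)/\Gamma(q_i + 1)](q)$.

Apply the Z transform convolution theorem (see App. B.2.) to find

$\sum_{\mathbf{q}} \binom{q}{\mathbf{q}} \gamma_{n + \mathbf{q}} = \Gamma(q + 1)\, Z^{-1}[\prod_{i \in \sigma} Z[\Gamma(n_i + q_i + 1)/\Gamma(q_i + 1)]]$.

Note that $Z\left[ \frac{\Gamma(n_i + q_i + 1)}{\Gamma(q_i + 1)} \right](z) = \Gamma(n_i + 1)(1 - z)^{-(n_i + 1)}$ for $|z| < 1$ and substitute for the Z transforms to find $\sum_{\mathbf{q}} \binom{q}{\mathbf{q}} \gamma_{n + \mathbf{q}} = \gamma_n \frac{\Gamma(\beta_1 + q)}{\Gamma(\beta_1)}$. Substituting this result in (*) and simplifying leads to the desired result. QED.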



We resort to analytic continuation in the non-contained overlap case. 

E. EXISTENCE CONDITIONS 

Here we present an example of a calculation for determining the conditions of existence of the various integrals $I[\cdot, \cdot]$ appearing in these papers. For the integrands of these papers, existence of these integrals depends upon the behavior of the singularities appearing at the edges of the region of integration.

Consider the single pair overlap integral $I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n]$, where $\rho_1$, $\rho_2$, $\sigma_1$, and $\sigma_2$ are as in the definitions for Thm. 14a, with the minor change that $\sigma_{1+2}$ contains all $m$ indices, which may be made without loss of generality. We show that the conditions for existence of this integral are $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$, $\mathrm{Re}(\beta_1 + \eta_1) > 0$ and $\mathrm{Re}(\beta_2 + \eta_2) > 0$. Write the integral as

$I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = \int dp\, \Delta(p)\, \Theta(p)\, \rho_1^{\eta_1} \rho_2^{\eta_2} \prod_{i=1}^m p_i^{n_i}$. (E.1)

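The first condition, $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$, follows immediately from the fact that $\int_0 dp_i\, p_i^{n_i}$ exists iff $\mathrm{Re}(\nu_i) > 0$ and the fact that any $p_i$ may independently be near zero for this particular overlap case. Now, either $\rho_1$ or $\rho_2$ may also be near zero. We consider the first case, in which $\rho_1$ is near zero, and use symmetry to supply the result for the second case. Letting $x = \sum_{i \in \sigma_1} p_i$ and $y = \sum_{i \in \sigma_{12}} p_i$, rewrite Eq. (E.1) in a form that isolates $\rho_1$ as

$I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = \int_0^1 dx\, x^{\eta_1} \int_0^x dy\, (1 - x + y)^{\eta_2} \Big( \int dp_{1-12}\, \delta(\sum_{1-12} p_i - (x - y)) \prod_{1-12} p_i^{n_i}$

$\quad \times \int dp_{12}\, \delta(\sum_{12} p_i - y) \prod_{12} p_i^{n_i} \times \int dp_{2-12}\, \delta(\sum_{2-12} p_i - (1 - x + y)) \prod_{2-12} p_i^{n_i} \Big)$, (E.2)

where here the subscript notation indicates the sets of indices involved, e.g. $1-12$ indicates $i \in \sigma_{1-12}$. Each of the three integrals over $p$ in Eq. (E.2) may be done in closed form. Do these integrals using Thm. 9a and induction to find

$I[\rho_1^{\eta_1} \rho_2^{\eta_2}, n] = \frac{\prod_{i=1}^m \Gamma(\nu_i)}{\Gamma(\beta_{1-12})\, \Gamma(\beta_{12})\, \Gamma(\beta_{2-12})} \times \int_0^1 dx\, x^{\eta_1} \int_0^x dy\, (x - y)^{\beta_{1-12} - 1}\, y^{\beta_{12} - 1}\, (1 - x + y)^{\eta_2 + \beta_{2-12} - 1}$. (E.3)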
Apply the binomial theorem in (E.3) to expand two of the three factors in the integrand,
$(x - y)^{\beta_{1-12} - 1}$ and $(1 - x + y)^{\eta_2 + \beta_{2-12} - 1}$, in series. Using these series, note that each term in the
series for $(x - y)^{\beta_{1-12} - 1}$ will contribute the same power of $x$ after the integration over $y$, while
the terms of the series for $(1 - x + y)^{\eta_2 + \beta_{2-12} - 1}$ contribute increasing powers of $x$ after the integration over $y$. Note also that if the lowest-power-of-$x$ term is integrable over $x$ in a region containing 0, then all terms are. Thus, the worst case occurs with the constant term from the binomial
series for $(1 - x + y)^{\eta_2 + \beta_{2-12} - 1}$. After integration over $y$ with this constant term, and considering
the small $x$ region of integration, we are left with the integral over $x$ given by

$$C \int_0^{\chi} dx\, x^{\beta_1 + \eta_1 - 1}, \tag{E.4}$$

where $0 < \chi < 1$, and $C$ is a constant. This integral exists for $\mathrm{Re}(\beta_1 + \eta_1) > 0$. This, symmetry,
and the first condition (given by $\mathrm{Re}(\nu_i) > 0$, $i = 1, \ldots, m$) establish the result. The method for
more complicated overlap structures is also indicated by this discussion.
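The $y$ integration that leads to Eq. (E.4) can be written out explicitly. The following is only a sketch: it assumes $\beta_1 = \beta_{1-12} + \beta_{12}$ (a relation not restated in this appendix), keeps only the constant term of the series for the factor containing $(1 - x + y)$, and writes $B(\cdot, \cdot)$ for the beta function.

```latex
\int_0^x dy\, (x-y)^{\beta_{1-12}-1}\, y^{\beta_{12}-1}
  = x^{\beta_{1-12}+\beta_{12}-1} \int_0^1 du\, (1-u)^{\beta_{1-12}-1}\, u^{\beta_{12}-1}
  = B(\beta_{12}, \beta_{1-12})\, x^{\beta_1 - 1},
\qquad
\int_0^{\chi} dx\, x^{\eta_1}\, x^{\beta_1 - 1}
  = \frac{\chi^{\beta_1+\eta_1}}{\beta_1+\eta_1}
  \quad \text{for } \mathrm{Re}(\beta_1+\eta_1) > 0 .
```

The substitution $y = xu$ is what makes every term of the series for the $(x - y)$ factor carry the same power of $x$.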

The discussion above is of interest in another way: it provides a general method for finding 
multiple overlap integrals without the use of transform theory. 

F. DERIVATIVES OF OVERLAP CONVOLUTIONS - POLES 

In this appendix we find derivatives with respect to $\eta$ of expressions such as $\Gamma(k - \eta)/\Gamma(-\eta)$,
where $k \in \{0, 1, \ldots\}$, and $\eta$ may be any number within the constraints of existence. We consider
the various cases that arise when some combination of poles occurs and demonstrate the various 
simplified expressions for the derivatives in these cases. 

When $\eta$ is not an integer, there are no poles in either $\Gamma(k - \eta)$ or $\Gamma(-\eta)$ and the usual derivative expressions hold. When $\eta$ is an integer and $\eta < 0$, the usual expressions also hold since $k \ge 0$
and therefore $k - \eta > 0$. The case where the usual expressions hold will be denoted as case 0.

When $\eta$ is an integer and $\eta \ge 0$, two cases are of interest. The first, case 1, occurs
when $\eta \ge 0$ and $k - \eta > 0$, so that there are poles in the denominator of $\Gamma(k - \eta)/\Gamma(-\eta)$ only. The
second, case 2, occurs when $\eta \ge 0$ and $k - \eta \le 0$, so that there are poles in both the numerator and
the denominator.

In order to find expressions for the derivatives in cases 1 and 2 we use the following facts. 1)
The only singularities of the gamma function $\Gamma(x)$ are simple poles at $x = -n$, $n = 0, 1, \ldots$ with
residues $(-1)^n/n!$ respectively. 2) $\Delta\Phi^{(n)}(k - \eta, -\eta) = (-1)^{n-1}\, \Gamma(n) \sum_{i=1}^{k} \frac{1}{(k - i - \eta)^n}$ whenever the expressions exist. (The identity $\Gamma(k - \eta)/\Gamma(-\eta) = \prod_{i=1}^{k} (k - i - \eta)$ may be used in deriving this.) 3) $\Gamma(k - \eta)/\Gamma(-\eta)$ is the representation (away from the poles in the gamma functions)
of an everywhere-analytic function (note that $k \ge 0$ is still assumed). Using these facts, the expressions for cases 1 and 2 are found by substituting $\xi = \eta + \delta$ for $\eta$ (now restricted by the conditions
of cases 1 and 2 to be a nonnegative integer) in the corresponding case 0 expressions and taking
the $\delta \to 0$ limit.
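The product identity quoted under fact 2 is easy to check numerically. The sketch below assumes only Python's standard `math.gamma`; the function names `gamma_ratio` and `product_form` are illustrative and do not come from the paper.

```python
import math

def gamma_ratio(k, eta):
    """Gamma(k - eta) / Gamma(-eta); valid only away from the poles of both gamma functions."""
    return math.gamma(k - eta) / math.gamma(-eta)

def product_form(k, eta):
    """prod_{i=1}^{k} (k - i - eta); a polynomial in eta, defined for every eta."""
    result = 1.0
    for i in range(1, k + 1):
        result *= (k - i - eta)
    return result

# Non-integer eta: the two forms should agree to rounding error.
for k, eta in [(0, 0.3), (3, 0.4), (4, -1.7), (2, 5.25)]:
    print(k, eta, gamma_ratio(k, eta), product_form(k, eta))

# Integer eta: the gamma ratio is undefined, but the product stays finite.
print(product_form(3, 1))  # case 1 (0 <= eta < k): one factor vanishes
print(product_form(2, 5))  # case 2 (eta >= k): nonzero value
```

The product form is the everywhere-analytic function referred to in fact 3, which is why it remains usable at the integer values of $\eta$ where the individual gamma functions have poles.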



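Case 0: $\eta$ non-integer, or $\eta$ an integer with $\eta < 0$ and $k - \eta > 0$. There are no poles in the
numerator or denominator. The first derivative is given by

$$\partial_\eta \left[ \frac{\Gamma(k - \eta)}{\Gamma(-\eta)} \right] = -\frac{\Gamma(k - \eta)}{\Gamma(-\eta)}\, \Delta\Phi^{(1)}(k - \eta, -\eta). \tag{F.1}$$

The $r$th derivative may be found by iteration, using Eq. (F.1) and the recursion relation

$$\partial_\eta \Phi^{(n)}(k - \eta) = -\Phi^{(n+1)}(k - \eta). \tag{F.2}$$

For example, taking the derivative of Eq. (F.1), applying Eq. (F.2) with Eq. (F.1) in the
process, yields the second derivative

$$\partial_\eta^2 \left[ \frac{\Gamma(k - \eta)}{\Gamma(-\eta)} \right] = \frac{\Gamma(k - \eta)}{\Gamma(-\eta)} \left[ \Delta\Phi^{(1)}(k - \eta, -\eta)^2 + \Delta\Phi^{(2)}(k - \eta, -\eta) \right]. \tag{F.3}$$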
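The step from Eq. (F.1) to Eq. (F.3) can be spelled out as follows. This is a sketch that assumes $\Delta\Phi^{(n)}(a, b)$ denotes the difference $\Phi^{(n)}(a) - \Phi^{(n)}(b)$, so that Eq. (F.2) gives $\partial_\eta \Delta\Phi^{(n)}(k - \eta, -\eta) = -\Delta\Phi^{(n+1)}(k - \eta, -\eta)$. The shorthand $f$, and $\Delta\Phi^{(n)}$ written without arguments, are introduced here only for brevity.

```latex
f(\eta) \equiv \frac{\Gamma(k-\eta)}{\Gamma(-\eta)}, \qquad
\partial_\eta f = -f\, \Delta\Phi^{(1)},
\\
\partial_\eta^2 f
  = -(\partial_\eta f)\, \Delta\Phi^{(1)} - f\, \partial_\eta \Delta\Phi^{(1)}
  = f\, \big[\Delta\Phi^{(1)}\big]^2 + f\, \Delta\Phi^{(2)} .
```

Higher derivatives follow by repeating the same two substitutions.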



Case 1: $\eta$ an integer, $\eta \ge 0$ and $k - \eta > 0$. The denominator contains a pole.

The zeroth derivative is 0. Taking the appropriate limit in Eq. (F.1) gives us the first
derivative

$$(-1)^{\eta + 1}\, \eta!\, \Gamma(k - \eta), \tag{F.4}$$

while taking the appropriate limit in Eq. (F.3) yields the second derivative

$$2\, (-1)^{\eta}\, \eta!\, \Gamma(k - \eta) \sum_{i = 1,\; i \ne k - \eta}^{k} (k - i - \eta)^{-1}. \tag{F.5}$$

Case 2: $\eta$ an integer, $\eta \ge 0$ and $k - \eta \le 0$. Both the numerator and denominator contain poles.

The zeroth derivative is simply $(-1)^k\, \eta!/(\eta - k)!$. Taking the appropriate limit in Eq. (F.1)
gives us the first derivative

$$(-1)^{k+1}\, \frac{\eta!}{(\eta - k)!}\, \Delta\Phi^{(1)}(k - \eta, -\eta), \tag{F.6}$$

while taking the appropriate limit in Eq. (F.3) yields the second derivative

$$(-1)^{k}\, \frac{\eta!}{(\eta - k)!} \left[ \Delta\Phi^{(1)}(k - \eta, -\eta)^2 + \Delta\Phi^{(2)}(k - \eta, -\eta) \right]. \tag{F.7}$$
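The integer-$\eta$ expressions of cases 1 and 2 can be cross-checked against exact derivatives of the polynomial $\prod_{i=1}^{k} (k - i - \eta)$, which represents $\Gamma(k - \eta)/\Gamma(-\eta)$ everywhere. The sketch below uses exact rational arithmetic. All function names are illustrative. It assumes that $\Delta\Phi^{(n)}$ is evaluated through the finite sum of fact 2, that $\Gamma(k - \eta) = (k - \eta - 1)!$ for the integer arguments involved, and that the case 1 second derivative is $2(-1)^{\eta}\, \eta!\, \Gamma(k - \eta) \sum_{i \ne k - \eta} (k - i - \eta)^{-1}$.

```python
from fractions import Fraction
from math import factorial

def poly_coeffs(k):
    """Coefficients (lowest degree first) of prod_{i=1}^{k} (k - i - eta) in powers of eta."""
    coeffs = [1]
    for j in range(k):                  # the factors are (j - eta), j = 0, ..., k-1
        new = [0] * (len(coeffs) + 1)
        for d, c in enumerate(coeffs):
            new[d] += j * c             # contribution of the constant j
            new[d + 1] -= c             # contribution of -eta
        coeffs = new
    return coeffs

def derivative_at(coeffs, order, eta):
    """Exact value at eta of the order-th derivative of the polynomial."""
    for _ in range(order):
        coeffs = [d * c for d, c in enumerate(coeffs)][1:]
    return sum(c * eta ** d for d, c in enumerate(coeffs))

def delta_phi(n, k, eta):
    """Finite-sum form of Delta Phi^(n)(k - eta, -eta); used here for integer eta >= k."""
    total = sum(Fraction(1, (k - i - eta) ** n) for i in range(1, k + 1))
    return (-1) ** (n - 1) * factorial(n - 1) * total

def case1(k, eta):
    """Zeroth, first and second derivative for integer 0 <= eta < k."""
    pref = factorial(eta) * factorial(k - eta - 1)      # eta! * Gamma(k - eta)
    s = sum(Fraction(1, k - i - eta) for i in range(1, k + 1) if i != k - eta)
    return 0, (-1) ** (eta + 1) * pref, 2 * (-1) ** eta * pref * s

def case2(k, eta):
    """Zeroth, first and second derivative for integer eta >= k."""
    zeroth = Fraction((-1) ** k * factorial(eta), factorial(eta - k))
    d1, d2 = delta_phi(1, k, eta), delta_phi(2, k, eta)
    return zeroth, -zeroth * d1, zeroth * (d1 ** 2 + d2)

def closed_form(k, eta):
    return case1(k, eta) if eta < k else case2(k, eta)

# Compare the closed forms with exact polynomial derivatives on a small grid.
for k in range(6):
    for eta in range(9):
        exact = tuple(derivative_at(poly_coeffs(k), r, eta) for r in range(3))
        assert exact == tuple(closed_form(k, eta)), (k, eta)
print("closed forms agree with exact polynomial derivatives")
```

Working with the polynomial avoids any limiting procedure, so the comparison does not depend on how the $\delta \to 0$ limit is taken.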

G. MULTIPLE OVERLAP RESULTS 

Here we define the general Convolution Form (CF) notation and demonstrate the results stated
without proof in Sec. 2e. Let $k = (k_1, \ldots, k_n)$ be a vector of nonnegative summation indices
(with $n$ the number of such indices) and let $C(\cdot)$ be any function of an $n$-vector of nonnegative integers. Further, let $t^k = \prod_{i=1}^{n} t_i^{k_i}$, $t_a = \sum_{i=1}^{n} a_i t_i$, where $a = (a_1, \ldots, a_n)$ with components in the
reals, and let $\alpha_k$ be a complex number indexed by $k$.

Define the CF symbol $(a;b)$ to be the set of expressions of the form

$$\exp(-\tau t_a) \sum_k C(k)\, \tau^{\alpha_k - 1}\, t^k, \tag{G.1}$$

where $\tau$ is known as the convolution variable and where the occurrence indicator vector $b$ has
components in $\{0, 1\}$ and $b_i = 1$ if and only if $t_i$ occurs in the summation $\sum_k C(k)\, \tau^{\alpha_k - 1}\, t^k$, i.e.
$b_i = 1$ iff $\exists\, k \ni C(k) \ne 0$ and $k_i > 0$. All the aspects of $C(k)$ that are relevant to the analysis of
Sec. 2e are expressed by the vector $b$. A particular member of the CF $(a;b)$ will be written as
$(a;b)_u$, i.e. $u$ indexes the CF $(a;b)$.
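To make the definition of the occurrence indicator vector concrete, the following sketch computes $b$ from a table of coefficients. Representing $C$ as a Python dictionary keyed by the index vector $k$, and the name `occurrence_indicator`, are assumptions made purely for illustration.

```python
def occurrence_indicator(C, n):
    """Return b with b[i] = 1 iff some k has C(k) != 0 and k[i] > 0.

    C is a dict mapping n-tuples of nonnegative integers to coefficients
    (a hypothetical finite stand-in for the function C(.) of the text).
    """
    b = [0] * n
    for k, coeff in C.items():
        if coeff != 0:
            for i in range(n):
                if k[i] > 0:
                    b[i] = 1
    return tuple(b)

# Example with n = 3: t_1 appears through k = (2, 0, 0), while t_3 only
# multiplies a zero coefficient and therefore does not count as occurring.
C = {(0, 0, 0): 1.0, (2, 0, 0): -0.5, (0, 0, 1): 0.0}
print(occurrence_indicator(C, 3))  # prints (1, 0, 0)
```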

In order to represent convolutions of CF expressions, let $[(a;b), (c;d)]$ be the set of all expressions that result from the convolution of two CF members from $(a;b)$ and $(c;d)$ respectively, i.e.

$$[(a;b), (c;d)] = \left\{ \int_0^{\tau} (a;b)_u(x)\, (c;d)_w(\tau - x)\, dx \right\}, \tag{G.2}$$

where by $(a;b)_u(x)$ we mean $(a;b)_u$ with its convolution variable evaluated at $x$, and where $u$
and $w$ range over all indices for the CF's $(a;b)$ and $(c;d)$ respectively.

Now we show that the result of the convolution of any two members of CF's may be written
as a member of some CF. More precisely, we prove Eq. (13) of Sec. 2e,

$$[(a;b), (c;d)] \subset (c\,;\, b \vee d \vee \mathrm{nz}(c - a)). \tag{G.3}$$

To prove Eq. (G.3) start by noting that each integrated convolution in Eq. (G.2) has the representation (see App. A, Eq. (A.2))

$$e^{-\tau t_c} \sum_{j,k} C(j, k)\, t^j t^k\, \tau^{\alpha_j + \alpha_k - 1}\, \frac{\Gamma(\alpha_j)\, \Gamma(\alpha_k)}{\Gamma(\alpha_j + \alpha_k)}\; {}_1F_1(\alpha_j;\, \alpha_j + \alpha_k;\, \tau t_{c-a}). \tag{G.4}$$

In Eq. (G.4), as before, $j$ and $k$ are vectors of nonnegative summation indices, $C(\cdot, \cdot)$ is a function
of the vectors $j$ and $k$ (specifically, it is the product of the $C$'s appearing in the two CF's $(a;b)_u(x)$
and $(c;d)_w(x)$), $t^j = \prod_{i=1}^{n} t_i^{j_i}$ (and similarly for $t^k$), $t_c = \sum_{i=1}^{n} c_i t_i$ and $t_{c-a} = \sum_{i=1}^{n} (c_i - a_i)\, t_i$.
Now, expand the hypergeometric ${}_1F_1$ in the series (see App. A)

$${}_1F_1(\alpha_j;\, \alpha_j + \alpha_k;\, \tau t_{c-a}) = \sum_{i=0}^{\infty} \frac{(\alpha_j)_i}{(\alpha_j + \alpha_k)_i}\, \frac{(\tau t_{c-a})^i}{i!}. \tag{G.5}$$
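A single term of the representation in Eq. (G.4) can be checked numerically. The sketch below is an illustration under stated assumptions: $t_a$ and $t_c$ are treated as given real numbers, the exponents are chosen so that the integrand is smooth, the integral is estimated with a composite Simpson rule, and ${}_1F_1$ is evaluated by truncating its standard power series. The function names are hypothetical.

```python
import math

def hyp1f1(a, b, z, terms=80):
    """Truncated series sum_i (a)_i / (b)_i * z^i / i! for the confluent hypergeometric function."""
    total, term = 0.0, 1.0
    for i in range(terms):
        total += term
        term *= (a + i) / (b + i) * z / (i + 1)
    return total

def convolution(alpha_j, alpha_k, t_a, t_c, tau, steps=2000):
    """Composite Simpson estimate of int_0^tau f(x) g(tau - x) dx, where
    f(x) = exp(-x t_a) x^(alpha_j - 1) and g(x) = exp(-x t_c) x^(alpha_k - 1).
    The number of steps must be even."""
    def integrand(x):
        return (math.exp(-x * t_a) * x ** (alpha_j - 1)
                * math.exp(-(tau - x) * t_c) * (tau - x) ** (alpha_k - 1))
    h = tau / steps
    total = integrand(0.0) + integrand(tau)
    for m in range(1, steps):
        total += (4 if m % 2 else 2) * integrand(m * h)
    return total * h / 3

def closed_form(alpha_j, alpha_k, t_a, t_c, tau):
    """One term of the closed-form representation, with t_{c-a} = t_c - t_a."""
    beta = math.gamma(alpha_j) * math.gamma(alpha_k) / math.gamma(alpha_j + alpha_k)
    return (math.exp(-tau * t_c) * tau ** (alpha_j + alpha_k - 1) * beta
            * hyp1f1(alpha_j, alpha_j + alpha_k, tau * (t_c - t_a)))

args = (2.0, 3.0, 0.7, 1.9, 1.5)   # alpha_j, alpha_k, t_a, t_c, tau
print(convolution(*args), closed_form(*args))
```

The two printed numbers should agree to within the quadrature error. The factor $e^{-\tau t_c}$ in `closed_form` is the source of the statement that the constant in the exponential's argument of the result is $t_c$.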
Substitute Eq. (G.5) into Eq. (G.4) and note the following four things: (1) if $b_i = 1$ then $t_i$ occurs
in Eq. (G.4), (2) if $d_i = 1$ then $t_i$ occurs in Eq. (G.4), (3) if $c_i - a_i \ne 0$ then $t_i$ occurs in Eq. (G.4),
and (4) if none of the cases 1-3 hold then $t_i$ does not occur in Eq. (G.4). Thus, the occurrence indicator vector for the result of the convolution Eq. (G.4) is $b \vee d \vee \mathrm{nz}(c - a)$. Noting also
that the constant in the exponential's argument is $t_c$ establishes Eq. (G.3).
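The bookkeeping in this proof amounts to a simple rule on CF symbols, sketched below. The tuple representation of a symbol and the function names are assumptions for illustration. In particular, `nz` is taken here to mark the nonzero components of its argument, which is how the four cases above use it.

```python
def nz(v):
    """Indicator of the nonzero components of v (assumed meaning of nz in the text)."""
    return tuple(1 if x != 0 else 0 for x in v)

def convolve_symbols(sym1, sym2):
    """Given CF symbols (a; b) and (c; d), return the symbol (c; b OR d OR nz(c - a))."""
    a, b = sym1
    c, d = sym2
    diff = tuple(ci - ai for ai, ci in zip(a, c))
    occ = tuple(bi | di | zi for bi, di, zi in zip(b, d, nz(diff)))
    return (tuple(c), occ)

# Example with n = 3: t_1 occurs in the first factor, and a and c differ only
# in their second component, so t_1 and t_2 occur in the result.
first = ((1, 0, 2), (1, 0, 0))
second = ((1, 1, 2), (0, 0, 0))
print(convolve_symbols(first, second))  # prints ((1, 1, 2), (1, 1, 0))
```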

Finally, Eq. (14) of Sec. 2e follows immediately from the definition of the CF in Eq. (G.l) and 
the properties of the T transform in App. A. 
FOOTNOTES

1 An alternate proof of Thm. 12 leads to an identity. Where the inverse T transform is applied in 

the proof of Thm. 12, instead find the convolution (x^ 1 *e Tt ®x^~ l ) using Thm. 10.2 and ex- 
press it in terms of 1 F 1 . Now, do the inverse transform and equate this result to the result of 

Thm. 12 to find Gauss's identity: ${}_2F_1((a, b); c; 1) = \dfrac{\Gamma(c)\, \Gamma(c - a - b)}{\Gamma(c - a)\, \Gamma(c - b)}$.



An alternate proof of Thm. 14a leads to another identity. Instead of applying Thm. 11, apply Thm. 
10.1 to the first two terms of the pairwise overlap convolution p^'~ 12 *e pt '®p^ 12 *e p(t » + t2) 

and immediately take the inverse Tj transform. Now, do the final convolution with p^ 212 *e pt2 
and take the inverse T 2 transform. Note that there is only a single summation in the result, whereas 
in the result in Thm. 14a there are two summations. On the other hand, the convolution 
pPi2 J e P< t i + t 2)0pP2-i2 ! e pt2 can b e taken first, followed by a convolution with p^ 1_ 12 *e pt ', 
effectively interchanging indices 1 and 2. Equating these two single-sum forms gives the identity 

$$\begin{aligned}
{}_3F_2[a_1, a_2, a_3;\, b_1, b_2;\, 1] = {} & \frac{\Gamma(b_1)\, \Gamma(b_2)\, \Gamma(b_1 + b_2 - (a_1 + a_2 + a_3))}{\Gamma(a_2)\, \Gamma(b_1 + b_2 - (a_1 + a_2))\, \Gamma(b_1 + b_2 - (a_2 + a_3))} \\
& \times {}_3F_2[b_1 - a_2,\; b_1 + b_2 - (a_1 + a_2 + a_3),\; b_2 - a_2;\; b_1 + b_2 - (a_1 + a_2),\; b_1 + b_2 - (a_2 + a_3);\; 1],
\end{aligned}$$

while equating either of the single-sum results just described to the original result of Thm. 14a 
yields Gauss's identity, discussed in footnote 1. 



Utilizing Gauss's identity (see footnote 1 and [9], Eq. 15.1.1) provides further simplification in 
Thm. 15a for cases 15a.2 and 15a.4. These simplifications are due to simplifications appearing in 

$F^{(10)}$ and $F^{(01)}$ respectively. The choice of the form of the results presented was made considering the simplicity of the results and consistency between the results.